TitanPet
This page documents a sub-task of the HPC adaptation project at UiO; please see the TitanTop page for background.
[http://www.delph-in.net/pet PET] (Platform for Efficient HPSG Processing Techniques) is the engine commonly used for batch parsing with DELPH-IN grammars: given a sequence of natural language utterances (typically sentences), it performs an n-best search for syntactic and semantic analyses according to the constraints provided by the grammar. PET is implemented in C++ (with core routines in pure C) and was originally developed in the mid-1990s.
At the time, PET was relatively carefully optimized, down to levels of memory locality, register utilization, alignment, and instruction pipelining (all targeting the 'large-memory' UltraSparc hardware of those days). The software has seen substantial enhancements in functionality in recent years, but for almost ten years it has not been carefully profiled and optimized for maximum efficiency on modern hardware. Over the same period, DELPH-IN grammars have grown substantially, and the technology is now routinely applied to relatively large amounts of unrestricted, running text.
This sub-project will apply various application-level and in-kernel profiling tools to analyze and improve resource utilization in PET, both at the per-process level and at the per-node level. For example, cursory, non-systematic observations made so far suggest that saturation of the memory sub-system can be a limiting factor when trying to use all cores of a standard TITAN node at once.
The LOGON tree includes a pre-processed input file for PET: uio/titan/cb.yy. This is a short [http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/ open source advocacy essay] written in rather difficult language. To parse this input, bypassing itsdb and everything else, run the following:
cheap -t -yy -packing -cm -default-les=all -nsolutions=500 \
-memlimit=1024 -timeout=60 $LOGONROOT/lingo/erg/english.grm \
< $LOGONROOT/uio/titan/cb.yy
It may also be worthwhile to run eight of these jobs simultaneously on a standard TITAN node, to see whether there are scalability issues.
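One way to run such a scaling check is sketched below. This is a minimal sketch, not part of PET or LOGON: the function name `run_parallel`, the log directory, and the `job.N.log` file names are all assumptions made for illustration. It starts N copies of a command concurrently, feeds each the same input file, and prints the elapsed wall-clock time, so that a run with one job can be compared against a run with eight.

```shell
# Hypothetical helper (not part of PET or LOGON): start N copies of a command
# at once, each reading the same input file, and report the wall-clock time.
# Usage: run_parallel N LOGDIR INPUT COMMAND [ARGS...]
run_parallel() (
  n=$1; logdir=$2; input=$3; shift 3
  mkdir -p "$logdir" || exit 1
  start=$(date +%s)
  pids=""
  i=1
  while [ "$i" -le "$n" ]; do
    # Each job writes to its own log. Stdin is redirected explicitly because,
    # in a non-interactive shell, background jobs otherwise read /dev/null.
    "$@" < "$input" > "$logdir/job.$i.log" 2>&1 &
    pids="$pids $!"
    i=$((i + 1))
  done
  failed=0
  for pid in $pids; do
    # Count jobs that exit with a non-zero status.
    wait "$pid" || failed=$((failed + 1))
  done
  end=$(date +%s)
  echo "jobs=$n failed=$failed elapsed=$((end - start))s"
  [ "$failed" -eq 0 ]
)

# Example invocation (the log directory is an arbitrary choice):
# run_parallel 8 /tmp/pet-scaling "$LOGONROOT/uio/titan/cb.yy" \
#   cheap -t -yy -packing -cm -default-les=all -nsolutions=500 \
#   -memlimit=1024 -timeout=60 "$LOGONROOT/lingo/erg/english.grm"
```

If the elapsed time for eight jobs is noticeably longer than for a single job on an otherwise idle node, that would be consistent with contention for a shared resource such as the memory sub-system, though the timing alone does not identify the cause.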
VD staff should obtain a PET installation from source. For use at UiO, we depend on the so-called chart mapping branch of PET, which is available via SVN:
svn co https://pet.opendfki.de/repos/pet/branches/cm pet
Some instructions on compilation are available in the README file that comes with the source tree, as well as on the DELPH-IN PetDependencies page. To get started, it should be sufficient to use Boost and ICU but ignore itsdb and ECL.