PetRobustness
Some notes on how to make parsing with Pet more robust.
For both methods, it is unclear which REL the unknown word is assigned, and whether it is treated as a word or as a lexeme. Once unknown-word handling has been applied, it is no longer possible to generate back from the MRS to the surface string. Such information should be included in the MRS somehow; this can be achieved in user-fns.lsp.
- Stephen points out that partial lexical gaps (e.g. a word that is a verb but only
encoded as a noun) are not captured by these mechanisms, but this will be solved when chart mapping is merged into the main branch.
cheap -default-les
Needs a mapping in the grammar. See also PetInput.
By Yi.
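A minimal sketch of how this might be invoked; the grammar image name, input file, and exact flag spelling are assumptions (they vary per PET version and grammar), so check PetInput for the mapping the grammar must provide:

```shell
# Hypothetical invocation: fall back on generic (default) lexical entries
# for words not in the lexicon. "english.grm" and "sentences.txt" are
# placeholder names.
cheap -default-les -mrs english.grm < sentences.txt
```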
cheap -predict-les
A Maximum Entropy model has to be created with a script by Yi, which needs a treebanked profile as input. In e.g. pred-lex.tdl, the lexical types that should be predictable are listed. At least 2000 sentences are needed.
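A sketch of the run-time side, assuming the MaxEnt model has already been trained from the treebanked profile as described above; file names are placeholders:

```shell
# Hypothetical invocation: let the trained MaxEnt model predict lexical
# types (those listed in e.g. pred-lex.tdl) for unknown words.
cheap -predict-les -mrs english.grm < sentences.txt
```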
- Supertagger, made by Tim and Phil. Still has to be finished and integrated.
- Is not merged to the main branch, yet, according to Yi.
- Roots (in english.set)
- choose the robust root for greater coverage (in english.set)
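For illustration, root selection lives in the grammar's PET settings file; the fragment below is a hypothetical sketch (the actual root names in english.set depend on the grammar version):

```
;; hypothetical fragment of english.set -- list the start symbols the
;; parser accepts; adding a robust root increases coverage at the cost
;; of admitting less strictly well-formed analyses.
start-symbols := $root_strict $root_robust.
```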
- Robustness rules/Mal rules
- The number of items of a corpus that can be parsed depends heavily on
the parameters passed to the parser. Some of those settings restrict the search space (such as reducing the maximum number of edges). The constraint that Dan uses now is the mem option in cheap (although you should halve it, for some strange reason!). This seems to be the most reasonable setting.
See also PetParameters.
- always use packing
- recommend -memlimit (amount/2) rather than -limit (edges)
- -timeout=1 (second) can also be useful
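Putting the recommendations above together, an invocation might look like the following sketch; the memory figure and file names are assumptions (per the halving note, pass half of what you actually want to allow):

```shell
# Hypothetical invocation combining the recommended settings:
# always pack, limit memory rather than edges, and bound parse time.
cheap -packing -memlimit=512 -timeout=1 -mrs english.grm < corpus.txt
```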