Infinite applications of ProgrVP by ud2gf #12

Open
inariksit opened this issue Oct 11, 2021 · 4 comments

Comments

@inariksit
Member

I'm running ud2gf with ShallowParse on the sentence "the cat sleeps". Here is the input, produced by parsing "the cat sleeps" with UDPipe and using this code to output the CoNLL-U format.

$ cat /tmp/cat.conllu
1       the     the     DET     _       _       2       det     _       _
2       cat     cat     NOUN    _       _       3       nsubj   _       _
3       sleeps  sleep   VERB    _       _       0       root    _       _

I run ud2gf as follows.

$ cat /tmp/cat.conllu | stack run gf-ud ud2gf grammars/ShallowParse Eng Text at

Infinite loop

First, ud2gf ran for 30 minutes until I stopped it.

Uncomment "beam size" of 123 trees

Next, I uncommented this line to restore the limit of at most 123 candidate trees. This works in the sense that ud2gf no longer gets stuck in an infinite loop, but the best tree still contains multiple applications of ProgrVP, even though the original sentence has none. Here's the output:

# bt0, the best (most complete) tree, without backups:
[3] sleeps 3 (2) VERB root (ImpVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (UseV sleep_V))))))))))))))))))))) : Imp[3]) 1
    *[1,2] cat 2 (1) NOUN nsubj (UseN cat_N : CN[2]) 1
        *[1] the 1 (2) DET det (the_Det : Det[1]) 1

# at, final GF tree, macros expanded:
AddBackupImp (ConsBackup (CNBackup (AddBackupCN (ConsBackup (DetBackup the_Det) BaseBackup) (UseN cat_N))) BaseBackup) (ImpVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (UseV sleep_V))))))))))))))))))))))
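To check output like the above automatically, here is a minimal Python sketch that measures the longest chain of a unary function applied to itself in a GF tree string. The function name `longest_self_application` and the shortened example tree are illustrative only; the sketch assumes trees are printed in the parenthesised style shown above and is not part of gf-ud.

```python
import re

def longest_self_application(tree: str, fun: str) -> int:
    """Return the length of the longest chain fun (fun (fun ...)) in a tree string.

    Assumes `fun` is unary, so a chain appears as "fun (fun (fun ...".
    """
    # Split into parentheses and identifiers; whitespace is dropped.
    tokens = re.findall(r"[()]|[^\s()]+", tree)
    best = run = 0
    for tok in tokens:
        if tok == fun:
            run += 1
            best = max(best, run)
        elif tok == "(":
            # An opening parenthesis between two applications keeps the chain alive.
            continue
        else:
            # Any other identifier or a closing parenthesis ends the chain.
            run = 0
    return best

# A shortened, hand-written tree with three nested ProgrVP applications.
EXAMPLE = "ImpVP (ProgrVP (ProgrVP (ProgrVP (UseV sleep_V))))"
print(longest_self_application(EXAMPLE, "ProgrVP"))
```

A result greater than zero for "the cat sleeps" would flag the problem described here, since the sentence contains no progressive form.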

Adding annotations to the conllu file

I have noticed before that I get weird trees if the file is missing morphological annotations. So I added them manually to the CoNLLU file:

$ cat /tmp/cat-annotated.conllu
1	the	the	DET	Det	FORM=0	2	det	_	_
2	cat	cat	NOUN	N	Number=Sing	3	nsubj	_	_
3	sleeps	sleep	VERB	V	Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin	0	root	_	_
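For comparing the two files, a small Python sketch that reads CoNLL-U lines and lists the tokens whose XPOS or FEATS column is still the placeholder `_` may help. The names `Token`, `parse_conllu` and `missing_annotations` are made up for this illustration, and the check is only a rough heuristic, not a statement of what ud2gf requires.

```python
from typing import NamedTuple

class Token(NamedTuple):
    idx: str
    form: str
    lemma: str
    upos: str
    xpos: str
    feats: str
    head: str
    deprel: str

def parse_conllu(text: str) -> list[Token]:
    """Parse the token lines of a CoNLL-U fragment (10 columns per line)."""
    tokens = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # CoNLL-U is tab-separated; fall back to whitespace for pasted text.
        cols = line.split("\t") if "\t" in line else line.split()
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        tokens.append(Token(*cols[:8]))
    return tokens

def missing_annotations(tokens: list[Token]) -> list[str]:
    """Word forms whose XPOS or FEATS column is the placeholder '_'."""
    return [t.form for t in tokens if t.xpos == "_" or t.feats == "_"]

PLAIN = "\n".join([
    "1\tthe\tthe\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tcat\tcat\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "3\tsleeps\tsleep\tVERB\t_\t_\t0\troot\t_\t_",
])
ANNOTATED = "\n".join([
    "1\tthe\tthe\tDET\tDet\tFORM=0\t2\tdet\t_\t_",
    "2\tcat\tcat\tNOUN\tN\tNumber=Sing\t3\tnsubj\t_\t_",
    "3\tsleeps\tsleep\tVERB\tV\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t0\troot\t_\t_",
])

print(missing_annotations(parse_conllu(PLAIN)))
print(missing_annotations(parse_conllu(ANNOTATED)))
```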

With this file, we now get a correct tree with MiniLang:

# MiniLang with cat.conllu (which is missing annotations)
AddBackupImp (ConsBackup (CNBackup (AddBackupCN (ConsBackup (TheBackup the_The) BaseBackup) (UseN cat_N))) BaseBackup) (ImpVP (UseV sleep_V))

# MiniLang with cat-annotated.conllu
PredVP (DetCN the_Det (UseN cat_N)) (UseV sleep_V)

But with ShallowParse, the tree is as wrong as ever, with multiple ProgrVPs.

# ShallowParse with cat-annotated.conllu
AddBackupImp (ConsBackup (CNBackup (AddBackupCN (ConsBackup (DetBackup thePl_Det) BaseBackup) (UseN cat_N))) BaseBackup) (ImpVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (ProgrVP (UseV sleep_V))))))))))))))))))))))

So it seems unlikely that the ProgrVP loop is due to user error/insufficiently annotated CoNLLU files.

Workaround

ProgrVP is the only function in ShallowParse of type a -> a, so I can just comment it out in the GF grammar. But of course, sometimes such functions are actually needed, so this is not a real solution.
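To illustrate why a function of type a -> a is troublesome, here is a toy Python model of candidate generation. It is not gf-ud's actual algorithm: the rule table, the `expand` function and the `max_rounds` cap are all assumptions made for the sketch. It only shows that a rule such as ProgrVP : VP -> VP always produces a new candidate of the same category, so expansion never reaches a fixpoint unless something bounds it.

```python
def expand(seed, rules, max_rounds):
    """Repeatedly apply unary rules to the newest trees.

    seed:  (tree_string, category)
    rules: list of (function_name, argument_category, result_category)
    Stops early if a round produces nothing new, otherwise after max_rounds.
    """
    frontier = [seed]
    all_trees = [seed]
    for _ in range(max_rounds):
        next_frontier = []
        for tree, cat in frontier:
            for name, arg_cat, res_cat in rules:
                if arg_cat == cat:
                    next_frontier.append((f"{name} ({tree})", res_cat))
        if not next_frontier:
            break  # fixpoint reached: no rule applies to the newest trees
        all_trees.extend(next_frontier)
        frontier = next_frontier
    return all_trees

SEED = ("UseV sleep_V", "VP")
WITH_PROGR = [("ProgrVP", "VP", "VP"), ("ImpVP", "VP", "Imp")]
WITHOUT_PROGR = [("ImpVP", "VP", "Imp")]

# With ProgrVP every round adds two more trees, so only the cap stops the loop.
print(len(expand(SEED, WITH_PROGR, max_rounds=3)))
# Without ProgrVP the expansion stops on its own after one round.
print(len(expand(SEED, WITHOUT_PROGR, max_rounds=3)))
```

In this model, commenting out ProgrVP corresponds to the second rule table, while a general fix would need some bound or cost on repeated self-application.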

@inariksit
Member Author

I notice that in the ShallowParse.labels file, there is this line

#disable UseComp MkVPS PositA UseComparA ProgrVP ExtAdvS UttImpSg ImpVP PassVP 

But it doesn't seem to do anything: I get functions like UseComparA even when running the test.conllu file, resulting in phrases like "the blacker cat" when the original text is "the black cat".

@aarneranta
Contributor

You seem to have found a bug or two. It sounds, as you say, like #disable is not implemented as it should be.

@anka-213
Member

@aarneranta #Disable does work, but only in the concrete labels file (as the documentation says). If it's in the abstract labels file like it is for some of the examples in the repo, it is silently ignored.
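Since a misplaced directive is silently ignored, a quick check can help spot it. The Python sketch below extracts the function names from `#disable` lines in the text of a labels file. The helper name `disabled_functions` is hypothetical, and the sketch assumes each directive sits on one line with whitespace-separated names, as in the line quoted earlier in this thread.

```python
def disabled_functions(labels_text: str) -> list[str]:
    """Collect function names listed after '#disable' in a labels file's text."""
    funs = []
    for line in labels_text.splitlines():
        words = line.split()
        if words and words[0] == "#disable":
            funs.extend(words[1:])
    return funs

# The directive quoted above from ShallowParse.labels.
ABSTRACT_LABELS = (
    "#disable UseComp MkVPS PositA UseComparA ProgrVP "
    "ExtAdvS UttImpSg ImpVP PassVP \n"
)

found = disabled_functions(ABSTRACT_LABELS)
if found:
    print("#disable found for:", ", ".join(found))
```

Running this over the abstract labels file and getting a non-empty list would suggest the directive needs to move to the concrete labels file to take effect.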

@inariksit
Member Author

11d9ef0 fixes this problem, so we can close the issue once it's merged into master.
