Incorrect sentence parsing using ja_core_news_trf #12106

Answered by polm
e-e asked this question in Help: Other Questions

In general, issues like this fall under #3052, which basically amounts to "the models make mistakes sometimes". If the mistake is common and follows a clear pattern, that might point to a fixable issue. In this case, there does seem to be something weird about how compound verbs are handled, so we'll take a closer look at that.

Note that if your goal is actually just sentence segmentation for Japanese, you should get high-quality results with a punctuation-based sentencizer instead of relying on the default sentence boundaries, which are derived from the dependency parse and so inherit any parser mistakes.
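To illustrate what "punctuation-based" means here, the following is a minimal standalone sketch in plain Python. It is not spaCy's implementation: the `split_sentences` helper and the `SENTENCE_END` character set are assumptions made up for this example. In spaCy v3 itself, the built-in rule-based component is named `sentencizer` and is added with `nlp.add_pipe("sentencizer")`.

```python
import re
from typing import List

# Sentence-final punctuation assumed for this sketch (hypothetical choice,
# not spaCy's default list): Japanese full stop plus full/half-width ! and ?.
SENTENCE_END = "。！？!?"

# A run of terminators, plus any closing quotes/brackets directly after it.
_BOUNDARY = re.compile(rf"[{re.escape(SENTENCE_END)}]+[」』）)]*")


def split_sentences(text: str) -> List[str]:
    """Split text after each run of sentence-final punctuation."""
    sentences = []
    start = 0
    for match in _BOUNDARY.finditer(text):
        sentence = text[start:match.end()].strip()
        if sentence:
            sentences.append(sentence)
        start = match.end()
    # Keep trailing text that has no terminating punctuation.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    for sent in split_sentences("今日は晴れです。明日は雨が降るでしょう！本当ですか？"):
        print(sent)
```

Because this approach never consults the parse tree, a parsing error such as a mishandled compound verb cannot move a sentence boundary. The trade-off is that punctuation inside quoted speech is still treated as a boundary, so text with heavy dialogue may be over-split.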

Answer selected by adrianeboyd
Labels
lang / ja: Japanese language data and models
feat / parser: Feature: Dependency Parser
perf / accuracy: Performance: accuracy
2 participants

This discussion was converted from issue #12099 on January 16, 2023 07:13.