Lemma model can select rule for wrong POS type #1

@bjascob

For the test cases 'quilting/NOUN' and 'plastering/NOUN', the words are not in the lemma lookup, so the OOV rules are called.

`getAllLemmasOOV('quilting', 'NOUN')` returns 'quilt' (it selects rule "ing,,False")
`getAllLemmasOOV('plastering', 'NOUN')` returns 'plastering' (it selects rule ",,False")

In the case of 'quilting', the model selects a verb rule. To prevent this, consider:

  • Add hard-coded rules to choose the next best if the rule doesn't apply
  • Split the model into 3 parts (verb, noun, adj/adv) and run separately
  • Add contra-cases to training data so it learns not to do this
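
As a rough illustration of the first option, the sketch below walks the model's ranked rules and takes the best one that is permitted for the requested POS. Everything here is hypothetical: `ALLOWED_RULES`, `pick_rule`, and the scores are made up for the example and are not part of the library, and the whitelist contents are only a guess at which rules belong to which POS.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical whitelist of rule classes per POS (illustrative only; the real
# rule inventory would come from the training data, not from this table).
ALLOWED_RULES: Dict[str, Set[str]] = {
    "NOUN": {",,False", "s,,False", "es,,False"},
    "VERB": {",,False", "ing,,False", "ed,,False", "s,,False"},
}


def pick_rule(scored_rules: List[Tuple[str, float]], upos: str) -> str:
    """Return the highest-scoring rule that is allowed for the given POS."""
    allowed = ALLOWED_RULES.get(upos)
    ranked = sorted(scored_rules, key=lambda pair: pair[1], reverse=True)
    for rule, _score in ranked:
        # With no whitelist for this POS, accept the model's top choice.
        if allowed is None or rule in allowed:
            return rule
    # Nothing allowed: fall back to the "leave the word unchanged" rule.
    return ",,False"


# Made-up scores standing in for the model output on 'quilting'.
scores = [("ing,,False", 0.61), (",,False", 0.30), ("s,,False", 0.09)]
print(pick_rule(scores, "NOUN"))  # prints ,,False
```

With these example scores, the verb rule "ing,,False" is skipped for NOUN and the next-best rule is used, so 'quilting' would be left unchanged.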

In addition, the model classes include the ending letters to remove. However, similar to the case above, there is nothing to prevent the model from selecting a "remove ing" rule for a word that ends in something else. I'm not aware of this causing issues, but it should be investigated when looking into the first issue.
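
A possible guard for this second case is sketched below. It assumes the first comma-separated field of a rule string (for example "ing" in "ing,,False") is the ending to remove; the other fields are not interpreted. `rule_applies` and `first_applicable` are hypothetical helper names, not existing functions.

```python
from typing import List, Optional


def rule_applies(word: str, rule: str) -> bool:
    """Check that the word actually ends with the letters the rule removes."""
    # Assumption: the first field of the rule is the suffix to strip.
    remove_suffix = rule.split(",")[0]
    # An empty suffix (as in ",,False") matches every word.
    return word.endswith(remove_suffix)


def first_applicable(word: str, ranked_rules: List[str]) -> Optional[str]:
    """Return the first rule, in ranked order, whose suffix fits the word."""
    for rule in ranked_rules:
        if rule_applies(word, rule):
            return rule
    return None


print(rule_applies("quilting", "ing,,False"))  # prints True
print(rule_applies("plaster", "ing,,False"))   # prints False
print(first_applicable("plaster", ["ing,,False", ",,False"]))  # prints ,,False
```

This check only catches rules whose ending does not match the word. It would not change the 'quilting' result, since that word does end in "ing".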

Labels: hold (No planned change at this time)