
2025-07-03: Improved validity of WN-LMF

@simongray simongray released this 03 Jul 08:54
· 6 commits to master since this release
  • The WN-LMF export now passes validation as per the output of the wn command line program.
    • The issues found and the process of elimination have been documented in GitHub issue #146.
  • Duplicate senses have been merged. Their synsets have been relabeled and relinked to COR.
  • Duplicate forms have been removed.
  • Synset self-references have been removed.
  • The wn:hypernym relations for adjectives that cross part-of-speech boundaries have been replaced with the purpose-made dns:crossPoSHypernym, which can be used to express something akin to hypernymic relations across different parts of speech.
    • NOTE: other cross-PoS hypernyms have not been modified for this release, though they have been excluded from the WN-LMF format for now as these relations are technically invalid when marked as wn:hypernym.
  • Duplicate ILI links have been excluded from the WN-LMF format.
    • 1194 of the synsets in DanNet are linked to the same resources in the CILI, which is not a valid use of the wn:ili relation!
  • Inferred wn:hyponym links are now included in the WN-LMF format.
  • dns:supersense is now wn:lexfile, the equivalent relation in the GWA schema.
    • The English WordNet did not previously use this relation itself, but will do so in future releases (per our request).
  • Various smaller synset exclusions have been made manually in the WN-LMF format to satisfy validation requirements.
  • The OEWN labels are now created based on the 2024 version of the Open English WordNet.
  • Synset indegrees have been recalculated.
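
Three of the cleanup steps listed above (removing synset self-references, adding inferred wn:hyponym links, and excluding duplicate ILI links) can be illustrated with a minimal Python sketch. This is not the DanNet export code: the function names, the tuple and dict representations, and the synset and ILI identifiers are all hypothetical, and the sketch assumes that every synset sharing an ILI with another synset is excluded, which the notes above do not spell out.

```python
from collections import defaultdict

# Hypothetical representation: a relation is a (source, relation, target) tuple.

def drop_self_references(relations):
    """Remove relations whose source and target are the same synset."""
    return [(s, r, t) for (s, r, t) in relations if s != t]

def infer_hyponyms(relations):
    """Append the inverse wn:hyponym edge for each wn:hypernym edge, if missing."""
    seen = set(relations)
    result = list(relations)
    for s, r, t in relations:
        inverse = (t, "wn:hyponym", s)
        if r == "wn:hypernym" and inverse not in seen:
            seen.add(inverse)
            result.append(inverse)
    return result

def exclude_duplicate_ili(ili_links):
    """Keep only links whose ILI id is used by exactly one synset.

    Assumption: all synsets sharing an ILI are dropped, not all but one.
    """
    by_ili = defaultdict(list)
    for synset, ili in ili_links.items():
        by_ili[ili].append(synset)
    return {s: i for s, i in ili_links.items() if len(by_ili[i]) == 1}

# Made-up identifiers, for illustration only.
relations = [
    ("s1", "wn:hypernym", "s2"),
    ("s2", "wn:hypernym", "s2"),  # self-reference, will be removed
    ("s3", "wn:hypernym", "s2"),
]
cleaned = infer_hyponyms(drop_self_references(relations))
unique_ili = exclude_duplicate_ili({"s1": "i1", "s2": "i2", "s3": "i2"})
```

With this sample data, `cleaned` keeps the two valid hypernym edges and gains two inverse hyponym edges, while `unique_ili` retains only the link for `s1`.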