This repository packages up data for the Open Multilingual Wordnet (OMW). It is roughly the version described in Bond and Foster (2013).
It provides the data in the original OMW 1.0 format and, as a release, repackaged in the GWA format for OMW 2.
It can be used by the Python library Wn.
The raw data (under wns) also has the automatically extracted data for over 150 languages from Wiktionary and the Unicode Common Locale Data Repository (CLDR).
More information about the Open Multilingual Wordnet
If you use OMW, please cite both the paper below and the individual wordnets you use (citation data is included with each wordnet):
Francis Bond and Ryan Foster (2013) Linking and extending an open multilingual wordnet. In 51st Annual Meeting of the Association for Computational Linguistics: ACL-2013. Sofia. 1352–1362
The directory wns has the wordnet data from OMW 1.2, with some small fixes:
- added a citation for the Icelandic wordnet
- added human readable citations in index.toml
By default the label is the name of the project. If the project has multiple wordnets, then the language is added in parentheses. E.g.:
label = "Multilingual Central Repository (Catalan)"
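The labelling rule above can be sketched in a few lines of Python. This is only an illustration of the convention, not the script used to build the release; the function name `make_label` and its parameters are hypothetical.

```python
from typing import Optional


def make_label(project: str, language: Optional[str] = None) -> str:
    """Build a wordnet label from a project name.

    `language` is passed only when the project has multiple wordnets;
    in that case it is appended in parentheses.
    """
    if language:
        return f"{project} ({language})"
    return project


# A project with several wordnets gets the language added.
print(make_label("Multilingual Central Repository", "Catalan"))
# A project with a single wordnet keeps just its name.
print(make_label("Example Wordnet"))
```

"Example Wordnet" is a placeholder name, not one of the wordnets in this collection.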
The package name (and id) for each wordnet is, by default, omw-lg, where lg is the language code, with the following exceptions:
- ItalWordnet will be omw-iwn, not omw-it (used by MultiWordNet)
- COW will just be omw-cmn, not omw-cmn-Hans
- WN derived from PWN 3.0 will be omw-en
- WN derived from PWN 3.1 will be omw-en31
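As a rough sketch of this naming scheme, the default and its exceptions can be expressed as a lookup table with a fallback. The function `package_id` and the keys of `EXCEPTIONS` are hypothetical names chosen for illustration; the identifiers actually used in this repository's build may differ.

```python
# Hypothetical project keys; only the resulting ids come from the list above.
EXCEPTIONS = {
    "ItalWordnet": "omw-iwn",  # not omw-it, which is used by MultiWordNet
    "COW": "omw-cmn",          # not omw-cmn-Hans
    "PWN 3.0": "omw-en",
    "PWN 3.1": "omw-en31",
}


def package_id(project: str, lang: str) -> str:
    """Return the package id: an exception if one is listed, else omw-<lang>."""
    return EXCEPTIONS.get(project, f"omw-{lang}")


print(package_id("ItalWordnet", "it"))    # exception applies
print(package_id("MultiWordNet", "it"))   # default rule: omw-it
print(package_id("COW", "cmn-Hans"))      # exception applies
```

Keying the exceptions by project rather than by language code keeps the two Italian wordnets distinct, which is the reason the exception exists in the first place.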
We thank the developers of all of the wordnets! More recent versions are available for many of these.