An online interface of this resource index is also available HERE.
This repository collects resources for NLP in Hebrew, as part of the NLPH project, which you can read more about here. Resources are divided into several key sections, each detailed in its own document:
- Corpora and Data Resources: Includes available corpora, annotated datasets, lexicons, dictionaries, word lists, and word embeddings.
- Models, Tools, and Services: Includes models and tools as well as commercial and online services.
- Additional Resources: A compilation of other resources such as annotation tools, collaborative projects, evaluation metrics, benchmark datasets and directories of labs & researchers, along with educational materials like courses, presentations, and meetups.
If you have a resource you can contribute, to be released under some open license, please submit a pull request, or contact us at [email protected]. See here for a list of companies operating in the field.
This specific document is meant to be a list of Hebrew NLP resources, both for general use and to be used as reference when discussing what existing tools can be opened, adapted or integrated to help create a good open source foundation for NLP in Hebrew, as part of the NLPH Project.
When contributing to the list, please add a link to the license for all non-paper resources, e.g. {`AGPL-3.0`_}, {?} for an unkonwn licesnse or {X} for unreleased/closed/copyrighted resources. For code resource, please also add the main language in which the tool is written, e.g. [Python] or [?] for an unknown programming language. Please add hosting mirrors with pointy brackets, e.g. <Zenodo mirror>.