German Maverick Coref

Python Package

The maverick-coref-de Python package provides a simple API for German Maverick models, enabling efficient and accurate coreference resolution with a few lines of code.

Install the library from PyPI

pip install maverick-coref-de

or from source

git clone https://github.com/uhh-lt/maverick-coref-de.git
cd maverick-coref-de
pip install -e .

Loading a Pretrained Model

Maverick models can be loaded from a Hugging Face model name or a local checkpoint path:

from maverick_de import Maverick

model = Maverick(
    hf_name_or_path="fynnos/maverick-mes-de10",  # Hugging Face model name or local checkpoint path; this value is the default
    device="cuda:0",  # "cpu" or a CUDA device such as "cuda"; "cuda:0" is the default
)
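
Because the default device is "cuda:0", loading may fail on a machine without a GPU. The following is a minimal sketch of a fallback, assuming PyTorch is the backend installed alongside the package; pick_device is a hypothetical helper and not part of maverick_de.

```python
def pick_device(preferred="cuda:0"):
    """Return `preferred` if PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is installed in this environment
    except ImportError:
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The resulting string can then be passed as the device argument when constructing the model.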

Inference

Predict

You can use model.predict() to obtain coreference predictions. For a sample input, the model will return a dictionary containing:

tokens, word tokenized version of the input.
clusters_token_offsets, a list of clusters containing mentions' token offsets.
clusters_text_mentions, a list of clusters containing mentions in plain text.
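
To illustrate how these three fields relate, here is a small sketch that rebuilds the plain-text mentions from the token offsets. The dictionary is a hand-written stand-in, not real model output, and the code assumes that each offset is an inclusive (start, end) pair of token indices; verify this against an actual prediction before relying on it.

```python
# Hand-written stand-in for the dictionary returned by model.predict(...).
# Assumption: every mention offset is an inclusive (start, end) token index pair.
sample_output = {
    "tokens": ["Anna", "sagt", ",", "dass", "sie", "müde", "ist", "."],
    "clusters_token_offsets": [[(0, 0), (4, 4)]],
    "clusters_text_mentions": [["Anna", "sie"]],
}


def mentions_from_offsets(tokens, clusters):
    """Rebuild each mention's text by joining the tokens its offsets cover."""
    return [
        [" ".join(tokens[start : end + 1]) for start, end in cluster]
        for cluster in clusters
    ]


rebuilt = mentions_from_offsets(
    sample_output["tokens"], sample_output["clusters_token_offsets"]
)
print(rebuilt)
```

Each inner list is one coreference cluster, so all mentions in the same inner list refer to the same entity.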

Training

Create a Python venv and install from source.

git clone https://github.com/uhh-lt/maverick-coref-de.git
cd maverick-coref-de
pip install -e .

Obtain data in .conll format split into train/dev/test
Run the minimize.py script from the data directory for the correct dataset
Adjust conf/data/<your dataset>.yaml for your dataset
Adjust conf/model/mes/<your encoder model>.yaml for your encoder model
Adjust conf/root.yaml to use your dataset and your encoder model
Run CUDA_VISIBLE_DEVICES=X python maverick_de/train.py
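
Before starting with the steps above, it can help to confirm that all three splits are in place. The sketch below is only an illustration: missing_splits is a hypothetical helper, and the file names (train.conll, dev.conll) are assumptions, since the naming that minimize.py expects is not specified here.

```python
import tempfile
from pathlib import Path


def missing_splits(data_dir, splits=("train", "dev", "test")):
    """Return the split names that have no matching *.conll file in data_dir."""
    data_dir = Path(data_dir)
    return [s for s in splits if not any(data_dir.glob(f"*{s}*.conll"))]


# Demo on a temporary directory that deliberately lacks a test split.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("train.conll", "dev.conll"):  # hypothetical file names
        (Path(tmp) / name).touch()
    missing = missing_splits(tmp)

print(missing)
```

An empty list means every split has at least one .conll file in the directory.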

Citation

If you use this software, please consider citing our paper published at KONVENS 2025:

@inproceedings{petersenfrey-etal-2025-efficient,
    title = "Efficient and effective coreference resolution for German",
    author = "Petersen-Frey, Fynn and Hatzel, Hans Ole and Biemann, Chris",
    booktitle = "Proceedings of the 21st Conference on Natural Language Processing (KONVENS 2025). Volume 1: Long and Short Papers",
    month = "9",
    year = "2025",
    address = "Hildesheim, Germany",
    publisher = "KONVENS 2025 Organizers"
}

The software in this repository is based on the work "Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends" by Giuliano Martinelli, Edoardo Barba, and Roberto Navigli, published at the ACL 2024 main conference. It uses their implementation, forked from the original repository, with some adaptations to a) make it compatible with German and b) try additional model variants. For English, refer to the original Python package.

License

The data and software are licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0.
