The maverick-coref-de Python package provides a simple API for using German Maverick models, enabling efficient and accurate coreference resolution in a few lines of code.
Install the library from PyPI:
pip install maverick-coref-de
or from source:
git clone https://github.com/uhh-lt/maverick-coref-de.git
cd maverick-coref-de
pip install -e .

Maverick models can be loaded using a Hugging Face model ID or a local checkpoint path:
from maverick_de import Maverick

model = Maverick(
    hf_name_or_path="fynnos/maverick-mes-de10",  # Hugging Face model ID or local checkpoint path (this value is the default)
    device="cuda:0",  # "cpu" or "cuda" (default: "cuda:0")
)

You can use model.predict() to obtain coreference predictions. For a sample input, the model will return a dictionary containing:
- tokens, the word-tokenized version of the input.
- clusters_token_offsets, a list of clusters containing the mentions' token offsets.
- clusters_text_mentions, a list of clusters containing the mentions in plain text.
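
To illustrate how the three fields relate, here is a minimal sketch that rebuilds the plain-text mentions from the token offsets. The `sample` dictionary is written by hand for illustration and is not real model output; in practice it would be the return value of model.predict(). The helper name `mentions_from_offsets` is hypothetical, and the sketch assumes each offset is a (start, end) pair of token indices with an inclusive end, which should be checked against actual predictions.

```python
def mentions_from_offsets(prediction):
    """Rebuild mention strings from token offsets.

    Assumption: each mention is a (start, end) pair of token indices
    and the end index is inclusive.
    """
    tokens = prediction["tokens"]
    return [
        [" ".join(tokens[start:end + 1]) for start, end in cluster]
        for cluster in prediction["clusters_token_offsets"]
    ]


# Hand-written stand-in for the dictionary returned by model.predict().
sample = {
    "tokens": ["Angela", "Merkel", "besucht", "Rom", ".",
               "Sie", "mag", "die", "Stadt", "."],
    "clusters_token_offsets": [[(0, 1), (5, 5)], [(3, 3), (7, 8)]],
    "clusters_text_mentions": [["Angela Merkel", "Sie"], ["Rom", "die Stadt"]],
}

rebuilt = mentions_from_offsets(sample)
for cluster in rebuilt:
    print(" | ".join(cluster))
```

If the inclusive-end assumption holds, the rebuilt mentions match clusters_text_mentions, which makes the helper a quick sanity check on predictions.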
Create a Python venv and install from source.
git clone https://github.com/uhh-lt/maverick-coref-de.git
cd maverick-coref-de
pip install -e .
- Obtain data in .conll format, split into train/dev/test
- Run the minimize.py script from the data directory for the correct dataset
- Adjust conf/data/<your dataset>.yaml for your dataset
- Adjust conf/model/mes/<your encoder model>.yaml for your encoder model
- Adjust conf/root.yaml to use your dataset and your encoder model
- Run CUDA_VISIBLE_DEVICES=X python maverick_de/train.py
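
The final training step can also be launched from Python, for example when scripting several runs. The following is only a sketch: the helper name `build_training_call` is hypothetical, and it assumes the script is started from the repository root with the configuration files already adjusted as described above.

```python
import os
import subprocess
import sys


def build_training_call(gpu_id, script="maverick_de/train.py"):
    """Return (command, environment) for a training run pinned to one GPU."""
    env = dict(os.environ)
    # Same effect as prefixing the shell command with CUDA_VISIBLE_DEVICES=X.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    command = [sys.executable, script]
    return command, env


command, env = build_training_call(0)

# Uncomment inside a checkout of the repository to start training:
# subprocess.run(command, env=env, check=True)
```

Copying os.environ instead of modifying it keeps the GPU restriction local to the training subprocess.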
If you use this software, please consider citing our paper published at KONVENS 2025:
@inproceedings{petersenfrey-etal-2025-efficient,
title = "Efficient and effective coreference resolution for German",
author = "Petersen-Frey, Fynn and Hatzel, Hans Ole and Biemann, Chris",
booktitle = "Proceedings of the 21st Conference on Natural Language Processing (KONVENS 2025). Volume 1: Long and Short Papers",
month = "9",
year = "2025",
address = "Hildesheim, Germany",
publisher = "KONVENS 2025 Organizers"
}

The software in this repository is based on the work "Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends" by Giuliano Martinelli, Edoardo Barba, and Roberto Navigli, published at the ACL 2024 main conference. It uses their implementation, forked from the original repository, with some adaptations to a) make it compatible with German and b) try additional model variants. For English, refer to the original Python package.
The data and software are licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0.