This repository contains the configuration and code utilities to run the ItaEval evaluation suite.
- The repository is a fork of lm-eval-harness. We last aligned with upstream at 1980a13.
- The suite backs a live leaderboard, available HERE
All the configuration files are under lm_eval/tasks/ita_eval. We have also included the suite as a "benchmark" under lm_eval/tasks/benchmarks.
We release several runner bash scripts to run base and chat models against the suite. Head to bash/ to find them.
Note that the recipes listed in the folder are tailored to our hardware and you will very likely need to adapt them to yours.
If all the dependencies are installed correctly, you should be able to run your model on ItaEval with:
MODEL="your-model-id-on-the-huggingface-hub"
lm_eval --model hf \
--model_args pretrained=${MODEL},dtype=bfloat16 \
--tasks ita_eval \
--batch_size 1 \
--log_samples \
--output_path "."
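To evaluate more than one model, the command above can be wrapped in a small loop. The following is a minimal sketch, not one of the scripts shipped in bash/: the model ids are placeholders, and `MODELS`, `BATCH_SIZE`, `OUTPUT_ROOT`, `DRY_RUN`, and `run_itaeval` are hypothetical names introduced only for this example. It assumes `--output_path` accepts a directory, as the `"."` in the command above suggests. By default it only prints the commands so you can inspect them first.

```shell
#!/bin/sh
# Sketch: run ItaEval over several models with the same flags as above.
# All variable and function names below are illustrative assumptions.

MODELS="your-org/your-first-model your-org/your-second-model"  # placeholders
BATCH_SIZE="${BATCH_SIZE:-1}"    # raise it if your hardware allows
OUTPUT_ROOT="${OUTPUT_ROOT:-.}"  # parent directory for per-model results
DRY_RUN="${DRY_RUN:-1}"          # 1 = print the commands, 0 = execute them

run_itaeval() {
  model="$1"
  # One output directory per model; "/" in hub ids is replaced with "__".
  out_dir="${OUTPUT_ROOT}/$(printf '%s' "$model" | sed 's|/|__|g')"
  # Build the command as positional parameters so quoting is preserved.
  set -- lm_eval --model hf \
    --model_args "pretrained=${model},dtype=bfloat16" \
    --tasks ita_eval \
    --batch_size "${BATCH_SIZE}" \
    --log_samples \
    --output_path "${out_dir}"
  if [ "${DRY_RUN}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

for m in $MODELS; do
  run_itaeval "$m"
done
```

Once the printed commands look right for your setup, set `DRY_RUN=0` to launch the evaluations.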
ItaEval and TweetyIta are the result of a joint effort by members of the Risorse per la Lingua Italiana community. We thank every member who dedicated their personal time to the sprints. We thank CINECA for providing the computational resources (ISCRA grant: HP10C3RW9F).