Open
Labels: datasets (Datasets related features), feature request (New feature or request), utils (Utility services like export and evaluation)
Description
https://github.com/huggingface/lighteval
LightEval is genuinely impressive: metrics, tasks, and benchmarks all come built in. It can replace our feature request on adding benchmarks and expanding our evaluation capabilities (#60).
The design is that users should have two ways to assess their model:
- Evaluate on the dataset the model was trained on (using the test split, of course), with user-defined preprocessing and metric computation
- Use lighteval for tasks and benchmarks
I would suggest migrating from `evaluate` to `lighteval`, since the former seems less actively maintained than the latter.
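
To make the first of the two paths concrete, here is a minimal sketch of evaluating on a held-out test split with user-supplied preprocessing. It is only an illustration in plain Python: `evaluate_on_test_split`, `predict`, and `preprocess` are hypothetical names, not part of this project or of lighteval, and accuracy plus macro-F1 are assumed stand-ins for whatever metrics we end up supporting.

```python
from collections import Counter
from typing import Callable, Sequence


def evaluate_on_test_split(
    predict: Callable[[str], str],
    test_split: Sequence[dict],
    preprocess: Callable[[dict], tuple[str, str]],
) -> dict[str, float]:
    """Run `predict` over a preprocessed test split; return accuracy and macro-F1.

    All names here are hypothetical; this is not an existing API.
    """
    if not test_split:
        raise ValueError("test_split must contain at least one example")

    golds, preds = [], []
    for example in test_split:
        text, label = preprocess(example)  # user-defined preprocessing
        golds.append(label)
        preds.append(predict(text))

    accuracy = sum(g == p for g, p in zip(golds, preds)) / len(golds)

    # Per-label true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(golds, preds):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1

    # Macro-F1: unweighted mean of per-label F1 over every label seen.
    f1_scores = []
    for label in sorted(set(golds) | set(preds)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)

    return {"accuracy": accuracy, "macro_f1": sum(f1_scores) / len(f1_scores)}


# Toy usage with a keyword "model" standing in for a trained one.
toy_test_split = [
    {"text": "great movie", "label": "pos"},
    {"text": "bad plot", "label": "neg"},
    {"text": "great cast", "label": "neg"},
    {"text": "bad ending", "label": "neg"},
]
scores = evaluate_on_test_split(
    predict=lambda text: "pos" if "great" in text else "neg",
    test_split=toy_test_split,
    preprocess=lambda ex: (ex["text"], ex["label"]),
)
print(scores)
```

Keeping `preprocess` and `predict` as plain callables leaves dataset handling entirely in the user's hands, which is the point of this path; the lighteval path would cover standardized tasks and benchmarks instead.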