
[feat] Integrate LightEval, HuggingFace's new LLM evaluation framework #74

@supreme-gg-gg

Description


https://github.com/huggingface/lighteval

LightEval looks amazing -- metrics, tasks, and benchmarks all come built in! It would supersede our feature request on adding benchmarks and expanding our evaluation capabilities (#60).

The design gives the user two ways to assess their model:

  • Evaluate on the dataset the model was trained on (using the test split, of course): the user handles preprocessing and we compute the metrics (see the sketch after this list)
  • Use lighteval for tasks and benchmarks (CLI sketch further below)
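
For the first path, here is a minimal sketch of the flow I have in mind, using our existing `evaluate` dependency. The model, dataset, and metric below are placeholder choices for illustration; in practice they come from the user:

```python
import evaluate
from datasets import load_dataset
from transformers import pipeline

# Placeholder model/dataset for illustration only.
clf = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
# Use a held-out split, never the training data; a small slice keeps the sketch fast.
test_split = load_dataset("sst2", split="validation").select(range(64))

# Map the pipeline's string labels back to the dataset's integer labels.
label2id = {"NEGATIVE": 0, "POSITIVE": 1}
predictions = [
    label2id[out["label"]]
    for out in clf(test_split["sentence"], truncation=True)
]

accuracy = evaluate.load("accuracy")
print(accuracy.compute(predictions=predictions, references=test_split["label"]))
```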

I would also say we should migrate from `evaluate` to `lighteval`, since the former seems less actively maintained than the latter.
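
For the second path, a rough sketch of kicking off a LightEval run from Python (mirrors running the CLI in a terminal). The `accelerate` backend and the `suite|task|num_fewshot|truncate_fewshots` task spec come from the lighteval README, but the CLI has been evolving between releases, so treat the exact flag names as assumptions and check the repo before wiring this in:

```python
import subprocess

# Equivalent to running lighteval from a terminal. The task spec format is
# "suite|task|num_fewshot|truncate_fewshots" per the lighteval README; the
# flag names here are assumptions and may differ across lighteval versions.
subprocess.run(
    [
        "lighteval", "accelerate",
        "--model_args", "pretrained=gpt2",
        "--tasks", "leaderboard|truthfulqa:mc|0|0",
        "--output_dir", "./eval_results",
    ],
    check=True,
)
```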


Labels: datasets, feature request, utils
