Added HELMET task with JSON datasets and tests #971

nayana1729 · 2025-09-17T01:21:24Z

This PR integrates the HELMET evaluation task into lighteval. Files added include:

helmet.py: the task implementation
JSON datasets: asqa_revised.json and qampari_revised.json
test_helmet.py: pytest tests that check dataset loading and prompt retrieval
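
For reviewers who want a picture of what a dataset-loading and prompt-retrieval check can look like, here is a minimal sketch. It is not the code in test_helmet.py: the helpers `load_json_dataset` and `build_prompt`, the `question` and `answers` keys, and the prompt format are all hypothetical, and the test writes a tiny stand-in file rather than reading asqa_revised.json or qampari_revised.json.

```python
import json
import tempfile
from pathlib import Path


def load_json_dataset(path):
    """Load a JSON file that is assumed to hold a list of example records."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError(f"expected a list of records, got {type(data).__name__}")
    return data


def build_prompt(record):
    """Hypothetical prompt builder; the 'question' key is an assumption."""
    return f"Question: {record['question']}\nAnswer:"


def test_dataset_loads_and_prompt_is_built():
    # Use a small stand-in dataset so the test does not depend on the real files.
    sample = [{"question": "Who wrote Hamlet?", "answers": ["William Shakespeare"]}]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.json"
        path.write_text(json.dumps(sample), encoding="utf-8")
        records = load_json_dataset(path)

    # Dataset loading: one record comes back with its fields intact.
    assert len(records) == 1
    assert records[0]["answers"] == ["William Shakespeare"]

    # Prompt retrieval: the question text appears in the generated prompt.
    prompt = build_prompt(records[0])
    assert "Who wrote Hamlet?" in prompt
```

Saved in a file whose name starts with `test_`, this is collected by running `pytest` in that directory. Keeping the fixture data inside the test means it does not break if the real dataset files move or change.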

References issue: #731

nayana1729 added 2 commits September 16, 2025 18:00

added helmet.py, test_helmet.py, and dataset files

17624c6

added copyright headers to helmet files

34f75e8

nayana1729 mentioned this pull request Sep 17, 2025

[EVAL] HELMET: long context evals #731

Open

NathanHB added the new-task label Sep 18, 2025
