Commit e45ed6e
fix: resolve duplicate task names and add safeguards. (#3394)
* fix: resolve duplicate task names and add safeguards.
This commit fixes 144 duplicate task names across MMLU-Redux and Flores translation benchmarks, and adds warnings and safeguards for more deterministic behavior.
Changes:
- Renamed all MMLU-Redux tasks from `mmlu_*_generative` to `mmlu_redux_*_generative` (57 tasks) to avoid conflicts with original MMLU tasks that use different datasets (cais/mmlu vs edinburgh-dawg/mmlu-redux-2.0)
- Fixed Flores translation task duplicates by prefixing with benchmark name (e.g., `flores_ca-pt` → `catalan_bench_flores_ca-pt`). Updated generation scripts and regenerated YAMLs for catalan_bench, portuguese_bench, basque_bench, spanish_bench, and galician_bench
- Added duplicate detection in TaskManager._get_task_and_group() that warns users when duplicate task names are found, showing both file paths for easier debugging
- Made directory walk deterministic by sorting dirs and file_list in os.walk() to ensure consistent task loading order across different filesystems and operating systems
The duplicate MMLU-Redux tasks were particularly problematic as they used different datasets but identical names, causing silent conflicts where users would unknowingly run the wrong benchmark variant.
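The naming scheme from the change list can be illustrated with two small helpers. Both function names are hypothetical and are not taken from the repository's generation scripts, and `mmlu_anatomy_generative` is only an example name matching the `mmlu_*_generative` pattern.

```python
def add_redux_prefix(task_name):
    """Turn `mmlu_*_generative` into `mmlu_redux_*_generative` (idempotent)."""
    prefix = "mmlu_"
    if task_name.startswith("mmlu_redux_") or not task_name.startswith(prefix):
        return task_name
    return "mmlu_redux_" + task_name[len(prefix):]


def add_bench_prefix(task_name, bench):
    """Prefix a Flores task name with its benchmark name (idempotent)."""
    if task_name.startswith(bench + "_"):
        return task_name
    return f"{bench}_{task_name}"


# Example name following the pattern in the change list.
print(add_redux_prefix("mmlu_anatomy_generative"))
# Example pair taken from the change list.
print(add_bench_prefix("flores_ca-pt", "catalan_bench"))
```

Making both helpers idempotent means a regeneration script could be rerun over files that were already renamed without stacking prefixes.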
* Fix tags and task names in group info.
1 parent b529c19 commit e45ed6e
File tree
216 files changed (+292, -551 lines)
- lm_eval/tasks
  - basque_bench/flores_eu
  - catalan_bench/flores_ca
  - cmmlu
  - galician_bench/flores_gl
  - mmlu-redux/generative
  - portuguese_bench/flores_pt
  - spanish_bench/flores_es