Skip to content

Conversation

LysandreJik
Copy link
Member

@LysandreJik LysandreJik commented Sep 9, 2025

This PR proposes a change in the way tasks and suites are managed. It's a proof of concept that isn't extensively tested at this point, in order to check whether this would make sense to pursue.

This PR's objective is twofold:

  • num_fewshots is now inferred to 0 by default if not specified.
  • suite is now automatically selected when a single suite contains the task requested. If the task is handled by more than one suite, an error is thrown.

Example of commands that would work:

- lighteval accelerate model_name=gpt2 "extended|mt_bench|0"
+ lighteval accelerate model_name=gpt2 mt_bench
- lighteval accelerate model_name=gpt2 "leaderboard|truthfulqa:mc|0"
+ lighteval accelerate model_name=gpt2 truthfulqa:mc

Example of commands that would not work:

lighteval accelerate model_name=gpt2 gsm8k  
❌ # ValueError: More than one suite available for task gsm8k: ['leaderboard', 'lighteval']

lighteval accelerate model_name=gpt2 "leaderboard|gsm8k"
✅

Caveat: at this time this is imperfect due to multimodal and community suites being opt-in. Specifying only the taskname of a task in those suites will result in the error that was here before.

The extended suites would have been there as well, but as seen with @NathanHB, we chose to remove the protection for extended tasks at this time; in case this remains useful, I'm happy to revert that part of the code.

Comment on lines +277 to +299
def _get_suite_from_task(self):
registry = self._load_full_registry()
task_name_to_suite = {}
for task in registry.keys():
suite, task_name = task.rsplit("|", 1)
if task_name not in task_name_to_suite:
task_name_to_suite[task_name] = [suite]
else:
task_name_to_suite[task_name].append(suite)

return task_name_to_suite

def _infer_suite_name_from_task(self, taskname):
suite_from_task = self._get_suite_from_task()

if taskname not in suite_from_task:
raise ValueError(f"Requested task {taskname} is not available")

if len(suite_from_task[taskname]) > 1:
raise ValueError(f"More than one suite available for task {taskname}: {suite_from_task[taskname]}")

else:
return suite_from_task[taskname][0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we keeping the suite concept at all?

if task.count("|") == 3:
logger.warning(
"Deprecation warning: You provided 4 arguments in your task name, but we no longer support the `truncate_fewshot` option. We will ignore the parameter for now, but it will fail in a couple of versions, so you should change your task name to `suite|task|num_fewshot`."
"Deprecation warning: You provided 4 arguments in your task name, but we no longer support the `truncate_fewshot` option. We will ignore the parameter for now, but it will fail in a couple of versions, so you should change your task name to `suite|task` or `suit|task|num_fewshot`."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suit -> suite

Comment on lines 298 to +358
if task.count("|") == 3:
logger.warning(
"Deprecation warning: You provided 4 arguments in your task name, but we no longer support the `truncate_fewshot` option. We will ignore the parameter for now, but it will fail in a couple of versions, so you should change your task name to `suite|task|num_fewshot`."
"Deprecation warning: You provided 4 arguments in your task name, but we no longer support the `truncate_fewshot` option. We will ignore the parameter for now, but it will fail in a couple of versions, so you should change your task name to `suite|task` or `suit|task|num_fewshot`."
)
suite_name, task_name, few_shot, _ = tuple(task.split("|"))
else:
elif task.count("|") == 2:
suite_name, task_name, few_shot = tuple(task.split("|"))
elif task.count("|") == 1:
suite_name, task_name = tuple(task.split("|"))
few_shot = 0
elif task.count("|") == 0:
suite_name = None
task_name = task
few_shot = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We initially had a system like this which we removed (managing optional cases) - I'm not extremely fond of going back that way, I think we should drop the suites "as such" entirely and consider them a part of the task name for community etc (¢ommunity__task_name)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants