Allow specifying model revision #441

@tomtseng
Issue

I'd like to specify which revision of the model should be evaluated. My use case is that I have a Hugging Face model to which I pushed commits throughout training, and I would like to know its AlpacaEval score at several of those training checkpoints.

Is there some way of specifying the model revision that I have overlooked?

What I've tried

The obvious thing to try is adding `revision` under `completions_kwargs`, like

Qwen2.5-0.5B-Instruct:
  prompt_template: "Qwen1.5-72B-Chat/prompt.txt"
  fn_completions: "huggingface_local_completions"
  completions_kwargs:
    model_name: "Qwen/Qwen2.5-0.5B-Instruct"
    # just for the sake of this example, I'm picking an arbitrary commit from the history of this model
    revision: "d955144"
  pretty_name: "Qwen2.5 0.5B Instruct"
  link: "https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct"

But this fails with:

INFO:root:Kwargs to completion: {'do_sample': True, 'model_kwargs': {'revision': 'main', 'torch_dtype': None, 'device_map': 'auto', 'load_in_8bit': False}, 'batch_size': 1, 'max_new_tokens': 2000, 'temperature': 0.7}
hub_kwargs={'revision': None, 'token': None, 'trust_remote_code': False, '_commit_hash': None}
model_kwargs={'revision': 'main', 'torch_dtype': None, 'device_map': 'auto', 'load_in_8bit': False}
Chunking for generation:   0%|          | 0/1 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "/Users/t/.pyenv/versions/myenv-3.10.16/bin/alpaca_eval", line 8, in <module>
    sys.exit(main())
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/alpaca_eval/main.py", line 608, in main
    fire.Fire(ALL_FUNCTIONS)
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/fire/core.py", line 135, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/fire/core.py", line 468, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/alpaca_eval/main.py", line 343, in evaluate_from_model
    model_outputs = get_completions(
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/alpaca_eval/main.py", line 328, in get_completions
    completions = fn_completions(prompts=prompts, **configs["completions_kwargs"])["completions"]
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/alpaca_eval/decoders/huggingface_local.py", line 129, in huggingface_local_completions
    pipeline = transformers.pipeline(
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/transformers/pipelines/__init__.py", line 928, in pipeline
    framework, model = infer_framework_load_model(
TypeError: transformers.pipelines.base.infer_framework_load_model() got multiple values for keyword argument 'revision'
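The traceback makes sense once you see how the arguments are forwarded: `revision` reaches the model loader both as an explicit keyword and inside the forwarded `model_kwargs` dict, and Python rejects a keyword supplied twice. A minimal reproduction of that failure mode (the function below is a simplified stand-in, not the real transformers internals):

```python
def load_model(model, revision="main", **model_kwargs):
    # Stand-in for transformers' internal loader; the real signature differs.
    return model, revision

try:
    # `revision` arrives twice: once explicitly and once via **model_kwargs,
    # mirroring what happens inside pipeline() with the config above.
    load_model(
        "Qwen/Qwen2.5-0.5B-Instruct",
        revision="d955144",
        **{"revision": "d955144", "torch_dtype": None},
    )
except TypeError as e:
    print(e)  # ... got multiple values for keyword argument 'revision'
```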

The fix for `huggingface_local_completions` is something like

@@ -121,7 +121,9 @@ def huggingface_local_completions(

     default_kwargs = dict(
         do_sample=do_sample,
-        model_kwargs={k: v for k, v in model_kwargs.items() if k != "trust_remote_code"},
+        model_kwargs={
+            k: v for k, v in model_kwargs.items() if k != "revision" and k != "trust_remote_code"
+        },
         batch_size=batch_size,
     )
     default_kwargs.update(kwargs)
@@ -131,6 +133,7 @@ def huggingface_local_completions(
         model=model,
         tokenizer=tokenizer,
         **default_kwargs,
+        revision=model_kwargs.get("revision", None),
         trust_remote_code=model_kwargs.get("trust_remote_code", False),
     )

but I have not looked at other decoders.
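The idea behind the diff above can be factored into a small helper that splits hub-level keys out of `model_kwargs`, so that each key reaches `pipeline()` exactly once. This is a hypothetical sketch of the pattern, not code from alpaca_eval:

```python
# Keys that transformers.pipeline() expects as top-level arguments rather
# than inside model_kwargs (assumed set for this sketch).
HUB_LEVEL_KEYS = {"revision", "trust_remote_code"}

def split_pipeline_kwargs(model_kwargs):
    """Split model_kwargs into (top_level, remaining) so no key is
    passed to pipeline() twice."""
    top_level = {k: v for k, v in model_kwargs.items() if k in HUB_LEVEL_KEYS}
    remaining = {k: v for k, v in model_kwargs.items() if k not in HUB_LEVEL_KEYS}
    return top_level, remaining

top, rest = split_pipeline_kwargs(
    {"revision": "d955144", "trust_remote_code": False, "torch_dtype": None}
)
print(top)   # {'revision': 'd955144', 'trust_remote_code': False}
print(rest)  # {'torch_dtype': None}
```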

Versions

Python 3.10.16
alpaca-eval 0.6.6
macOS Sequoia 15.3
