"You must manually fix the score pair" Error While Using Custom Evaluator #446

@tommycwh

I am new to AlpacaEval and am trying to run an evaluation with a custom evaluator. When I run evaluate_from_model with it, the log prints messages like "You must manually fix the score pair", and every data entry appears to be skipped because of this error. I do not understand what these messages mean and would appreciate advice on how to fix the problem.

While the evaluate_from_model procedure is running, it shows messages like:

...
ERROR:root:invalid syntax (<unknown>, line 1)
Content: <|im_start|>import numpy as np

def rank_models(model1, model2):

You must manually fix the score pair.
ERROR:root:invalid syntax (<unknown>, line 1)
Content: <|im_start|>import re
from collections import Counter

def rank_models(model_outputs):

You must manually fix the score pair.

All entries then appear to be dropped from the evaluation, resulting in a NaN win_rate:

INFO:root:drop 805 outputs that are not preferences
INFO:root:drop 805 outputs that are not preferences
INFO:root:Saving all results to results/Mistral-7B-Instruct-v0.2
INFO:root:Not saving the result to the cached leaderboard because precomputed_leaderboard is not a path but <class 'NoneType'>.
                          length_controlled_winrate  win_rate  standard_error  n_total  avg_length
Mistral-7B-Instruct-v0.2                      48.16       NaN             NaN        0        1676
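
For reference, `invalid syntax (<unknown>, line 1)` is the kind of message Python's `ast.literal_eval` raises when it is handed text that is not a Python literal. The sketch below is a hypothetical stand-in for a ranking-style parser, not AlpacaEval's actual implementation. It assumes the evaluator is expected to reply with a list of dicts carrying `model` and `rank` keys; the names `parse_ranking`, `mean_or_nan`, `GOOD_REPLY`, and `BAD_REPLY` are made up for illustration. It shows why a reply that starts with `<|im_start|>import numpy as np` cannot be parsed, and why a run in which nothing parses ends with a NaN average.

```python
import ast
import math


def parse_ranking(completion: str):
    """Hypothetical parser: return the rank of model_1 (1 or 2), or None on failure."""
    try:
        # literal_eval only accepts Python literals (lists, dicts, numbers, strings, ...).
        ranking = ast.literal_eval(completion)
        rank = next(item["rank"] for item in ranking if item["model"] == "model_1")
        return rank if rank in (1, 2) else None
    except (ValueError, SyntaxError, TypeError, KeyError, StopIteration) as err:
        print(f"{err}\nContent: {completion}")
        return None


def mean_or_nan(values):
    """Average of the values, or NaN when nothing survived parsing."""
    return sum(values) / len(values) if values else math.nan


# Assumed reply format (an assumption for this sketch, not taken from the logs above).
GOOD_REPLY = "[{'model': 'model_1', 'rank': 1}, {'model': 'model_2', 'rank': 2}]"
# A reply shaped like the "Content:" lines in the error log.
BAD_REPLY = "<|im_start|>import numpy as np\n\ndef rank_models(model1, model2):\n"

ranks = [parse_ranking(reply) for reply in (BAD_REPLY, BAD_REPLY)]
kept = [r for r in ranks if r is not None]
print(len(kept), mean_or_nan(kept))
```

If every reply looks like `BAD_REPLY`, no comparison survives parsing, which would be consistent with `n_total` being 0 and `win_rate` being NaN in the table above.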

In the automatically generated annotations_seed0_configs.json, the model outputs look natural, but the preference score is always -1. For example:

    "instruction":"What are the names of some famous actors that started their careers on Broadway?",
    "output_1":"Several famous actors started their careers on Broadway before making it big in film and television. Here are a few notable examples:\n\n1. Sarah Jessica Parker - Before she was Carrie Bradshaw on \"Sex and the City,\" Sarah Jessica Parker was a Broadway star, having appeared in productions like \"Annie\" as a child.\n\n2. Meryl Streep - Meryl Streep's early career included Broadway productions such as \"Trelawny of the 'Wells'\" and \"A Memory of Two Mondays \/ 27 Wagons Full of Cotton.\"\n\n3. Hugh Jackman - Hugh Jackman won a Tony Award for his role in \"The Boy from Oz\" and has been known for his stage work as well as his film career.\n\n4. Sutton Foster - Known for her television role in \"Younger,\" Sutton Foster is also a Broadway legend with leading roles in shows like \"Thoroughly Modern Millie\" and \"Anything Goes.\"\n\n5. Kristen Bell - Before she was the voice of Anna in \"Frozen\" or the star of \"The Good Place,\" Kristen Bell appeared in Broadway's \"The Adventures of Tom Sawyer\" and \"The Crucible.\"\n\n6. Audra McDonald - Audra McDonald is a renowned Broadway actress with a record-breaking number of Tony Awards. She's starred in \"Ragtime,\" \"Carousel,\" \"Master Class,\" and more.\n\n7. Nathan Lane - Nathan Lane is a Broadway veteran known for his roles in \"The Producers,\" \"A Funny Thing Happened on the Way to the Forum,\" and \"Angels in America.\"\n\n8. Idina Menzel - Before \"Frozen\" and \"Wicked\" made her a household name, Idina Menzel started on Broadway in shows like \"Rent\" and \"Hair.\"\n\n9. Lin-Manuel Miranda - Before \"Hamilton\" and \"In the Heights\" became huge hits, Lin-Manuel Miranda was performing on Broadway, eventually becoming a celebrated writer and actor.\n\n10. Lea Michele - Prior to her role on \"Glee,\" Lea Michele was a young Broadway actress in shows like \"Les Mis\u00e9rables,\" \"Ragtime,\" and \"Spring Awakening.\"\n\nThese actors are just a few examples of the many performers who have transitioned from the Broadway stage to broader fame in the entertainment industry. Broadway often serves as a proving ground for talent, and many actors continue to return to the stage throughout their careers.",
    "output_2":"Many talented actors have transitioned from Broadway to successful film and television careers. Here are some famous examples:\n\n1. Julie Andrews: Starred in Broadway's \"My Fair Lady\" (1956) and \"Camelot\" (1960) before becoming a Hollywood icon in movies like \"The Sound of Music\" (1965) and \"Mary Poppins\" (1964).\n2. Hugh Jackman: Originated the role of Curly McLain in Broadway's \"Oklahoma!\" (1998) and later starred in films like \"X-Men\" (2000), \"Les Mis\u00e9rables\" (2012), and \"The Greatest Showman\" (2017).\n3. Neil Patrick Harris: Got his start on Broadway in \"Assassins\" (1990) and \"Rent\" (1996) before becoming known for his TV roles in \"Doogie Howser, M.D.\" and \"How I Met Your Mother.\"\n4. Idina Menzel: Originated the role of Maureen Johnson in Broadway's \"Rent\" (1996) and later voiced Elsa in Disney's \"Frozen\" (2013) and starred in TV's \"Glee.\"\n5. Lin-Manuel Miranda: Created and starred in Broadway's \"In the Heights\" (2008) and \"Hamilton\" (2015), earning multiple Tony Awards and later appearing in films like \"Moana\" (2016) and TV's \"Curb Your Enthusiasm.\"\n6. Nathan Lane: Began his career on Broadway in \"Noises Off\" (1983) and \"Guys and Dolls\" (1992), followed by notable film roles in \"The Birdcage\" (1996) and \"The Producers\" (2005).\n7. Patti LuPone: Made her Broadway debut in \"Applause\" (1970) and won a Tony Award for \"Evita\" (1979), later appearing in TV shows like \"Glee\" and films like \"Witness\" (1985).\n8. Alan Cumming: Originated the role of Emcee in Broadway's \"Cabaret\" (1998) and has since appeared in films like \"X2: X-Men United\" (2003), \"Spy Kids\" (2001), and TV's \"The Good Wife.\"\n9. Sutton Foster: Starred in Broadway's \"Thoroughly Modern Millie\" (2002) and \"Shrek the Musical\" (2008), later landing roles in TV's \"Bunheads\" and \"Younger.\"\n10. Laura Benanti: Made her Broadway debut in \"My Fair Lady\" (1993) and has since appeared in TV shows like \"The Good Wife,\" \"Nashville,\" and films like \"No Strings Attached\" (2011).\n\nThese are just a few examples of talented actors who got their start on Broadway and went on to achieve success in film and television.",
    "annotator":"alpaca_eval_llama32_3b_instruct",
    "preference":-1.0
  },
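
One way to confirm how widespread the problem is would be to tally the `preference` values in the annotations file. This is a minimal sketch that assumes the file is a JSON list of records shaped like the one above. `count_preferences` is a hypothetical helper, and the two records in `sample` are trimmed stand-ins rather than real results.

```python
import json
from collections import Counter


def count_preferences(records):
    """Tally (annotator, preference) pairs across annotation records."""
    return Counter((r.get("annotator"), r.get("preference")) for r in records)


# Trimmed stand-in for the contents of the annotations JSON (not real results).
sample = json.loads("""
[
  {"instruction": "...", "annotator": "alpaca_eval_llama32_3b_instruct", "preference": -1.0},
  {"instruction": "...", "annotator": "alpaca_eval_llama32_3b_instruct", "preference": -1.0}
]
""")

counts = count_preferences(sample)
for (annotator, preference), n in counts.items():
    print(annotator, preference, n)
```

To run it on the real file, `sample` could be replaced with the result of `json.load` on annotations_seed0_configs.json.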

I used meta-llama/Llama-3.2-3B-Instruct from HuggingFace as the custom evaluator to evaluate the Mistral-7B-Instruct-v0.2 and Llama-3-Instruct-8B-SimPO models from AlpacaEval, and both runs give results like the ones shown above.

The custom evaluator config file I used, alpaca_eval_llama32_3b_instruct/configs.yaml, is:

alpaca_eval_llama32_3b_instruct:
  prompt_template: "alpaca_eval_llama32_3b_instruct/alpaca_eval_fn.txt"
  fn_completions: "huggingface_local_completions"
  completions_kwargs:
    model_name: "meta-llama/Llama-3.2-3B-Instruct"
    temperature: 0
    model_kwargs:
      torch_dtype: 'bfloat16'
  fn_completion_parser: "ranking_parser"
  batch_size: 1

The evaluation command I used is:

alpaca_eval evaluate_from_model \
  --model_configs 'Mistral-7B-Instruct-v0.2' \
  --annotators_config './alpaca_eval/src/alpaca_eval/evaluators_configs/alpaca_eval_llama32_3b_instruct/configs.yaml'

Thank you.
