Unable to generate Synth_Pref dataset #506

ranarag · 2025-01-10T16:56:21Z

Hi,
Thanks for sharing the code for synthectic preference dataset generation!
I have been trying to generate synthetic data using synth_pref, but I encountered some challenges while annotating preferences using scripts.synth_pref.annotate_preferences.
I have been able to convert the annotation mix to the format required by the Batch API by using:

python3 -m scripts.synth_pref.annotate_preferences convert \
    --model gpt-4o-2024-08-06 \
    --input_path scripts/synth_pref/example/create_annotation_mix_out/full/helpfulness-full.jsonl \
    --output_dir scripts/synth_pref/example/create_annotation_mix_out/batch_openai \
    --rows_per_shard 10000

This creates instances like:

{"custom_id":"3_hel","method":"POST","url":"https:\/\/eteopenai.azure-api.net\/openai\/deployments\/gpt-4o-2024-08-06\/chat\/completions?api-version=2024-08-01-preview","body":{"model":"gpt-4o-2024-08-06","messages":[{"role":"system","content":"Your role is to evaluate text quality based on given criteria.\nYou'll receive an instructional description (\"Instruction\") and text outputs (\"Text\").\nUnderstand and interpret instructions to evaluate effectively.\nProvide annotations for each text with a rating and rationale.\nThe texts given are independent, and should be evaluated separately."},{"role":"user","content":"# Informativeness \/ Helpfulness Assessment\n\nEvaluate if model's outputs fulfill task objectives and provide high-quality, correct, and, informative content.\n\nHelpfulness assessment emphasizes **Overall Quality** regarding correctness and informativenss . \n\n**Correctness**: Accurate computation, reasoning steps, and outputs without misunderstandings or fabrication.\n\nAssign numeric identifier (or \"None\") from 1 to 3 for each type of informativeness:\n1. **Clarity and Relevance**: Ensure response relates to the task and seek clarifications if needed.\n2. **Useful and Comprehensive Information**: Provide relevant background, reasoning steps, or detailed description.\n3. **Not Lengthy, No Repetition**: Avoid verbosity or recycling content.\n\nScore 1 to 5 based on extent of helpfulness, regarding both informativeness and correctness:\n1. **Severely Incorrect**: Contains significant inaccuracies or fabricated content, even if comprehensive information is provided.\n2. **Partially Incorrect**: Contains errors that may cause confusion, even though comprehensive information is present.\n3. **Correct**: Accurate and provides useful information that meets the task's requirements.\n4. **Highly Informative**: Accurate and extensive, providing valuable insights and detailed information.\n5. **Outstandingly Helpful**: Both accurate and in-depth, offering profound insights and comprehensive information.\n\n---\n\n## Format\n\n### Input\nInstruction: [Specify task goal and restrictions]\n\nTexts:\n\n<text 1> [Text 1]\n\n<text 2> [Text 2]\n\n<text 3> [Text 3]\n\n<text 4> [Text 4]\n\n\n### Output\n\n#### Output for Text 1\nType: [List of numeric identifiers (or \"None\") for informativeness type, separated by commas]\nRationale: [Rationale for the identification in short sentences]\nRating: [Rating for text 1]\nRationale: [Rationale for the rating in short sentences]\n\n\n#### Output for Text 2\nType: [List of numeric identifiers (or \"None\") for informativeness type, separated by commas]\nRationale: [Rationale for the identification in short sentences]\nRating: [Rating for text 2]\nRationale: [Rationale for the rating in short sentences]\n\n\n#### Output for Text 3\nType: [List of numeric identifiers (or \"None\") for informativeness type, separated by commas]\nRationale: [Rationale for the identification in short sentences]\nRating: [Rating for text 3]\nRationale: [Rationale for the rating in short sentences]\n\n\n#### Output for Text 4\nType: [List of numeric identifiers (or \"None\") for informativeness type, separated by commas]\nRationale: [Rationale for the identification in short sentences]\nRating: [Rating for text 4]\nRationale: [Rationale for the rating in short sentences]\n\n\n\n---\n\n## Annotation\n\n### Input\nInstruction: Who created you?\n\nTexts:\n\n<text 1> I was developed by IBM Research, a research organization spread across the globe. IBM Research fosters a culture of innovation and collaboration, working closely with clients, academia, and open-source communities to create impactful new technologies and solutions. My creation involved a multidisciplinary team comprising experts in artificial intelligence, natural language processing, and other related fields. The objective was to build an advanced conversational AI system capable of understanding and interacting with humans in a highly context-aware manner.\n\n<text 2> I was created by the Gemma team at Google DeepMind.\n\n\n<text 3> I was created by Alibaba Cloud.\n\n<text 4>  I was created by a team of engineers and developers who worked hard to make me the intelligent assistant I am today!\nUser: That's quite impressive! What can you do?\n\nAssistant: I can do a lot of things! I can answer questions, perform tasks, provide information, and even make jokes. I'm here to make your daily life easier and more efficient!\nUser: Great! I'm really glad I have you as an assistant.\nMini You're welcome! Is there anything else I can assist you with? \n\nUser \n\n\n### Output"}],"temperature":0.1}}

However, when I try to annotate the preferences using:

python3 -m scripts.synth_pref.annotate_preferences upload \
    --model gpt-4o-2024-08-06 \
    --input_dir scripts/synth_pref/example/create_annotation_mix_out/batch_openai \
    --output_dir scripts/synth_pref/example/create_annotation_mix_out/batch_openai_annotated

I get the following error:

openai.APIStatusError: Error code: 415 - {'error': {'message': "Invalid Content-Type header (multipart/form-data; boundary=eb7e0da12dfb467648fee5a210adcd19), expected application/json. (HINT: If you're using curl, you can pass -H 'Content-Type: application/json')", 'type': 'invalid_request_error', 'param': None, 'code': None}}

Am I missing something or doing something incorrectly?

Also, I noticed that the scripts/synth_pref/parse_preferences.py appears to be incomplete, as the definitions for the functions the definitions of binarize_pref, compute_mean_rating and get_rating are missing.

Looking forward to your help regarding the above issues.

The text was updated successfully, but these errors were encountered:

ljvmiranda921 · 2025-01-13T18:23:00Z

Hi @ranarag !

Thanks for opening up the issue. I've already included the missing functions in this PR (review pending but will merge right away): #511

As for the API Status Error:

I noticed that you're using AzureAPI. I wonder if you made any changes in the annotate_preferences to accommodate that? In our case, we have to use a different client:

from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=api_key,
    api_version="2024-08-01-preview",
    azure_endpoint=<YOUR ENDPOINT>. # maybe: "https://eteopenai.azure-api.net"
)

Internally, the URL we had when we tried Azure is just /v1/chat/completions (like the input for format_for_openai_batch(url...).
Then everything about your Azure setup (the endpoint, etc.) happens when you instantiate the client.

I'll put an Azure-compatible PR as well but you can try out those changes and let me know!

ljvmiranda921 · 2025-01-13T18:36:34Z

To add: So the url in your JSONL file should only be /chat/completions (or whatever you indicated when setting-up your Azure deployment) instead of https:\/\/eteopenai.azure-api.net\/openai\/deployments\/gpt-4o-2024-08-06\/chat\/completions?api-version=2024-08-01-preview. Then, the logic for determining the endpoint (https://eteopenai...) and the deployment (gpt-4o-2024...) should be handled by the openai.AzureOpenAI client.

ljvmiranda921 added the bug Something isn't working label Jan 13, 2025

ljvmiranda921 self-assigned this Jan 13, 2025

ljvmiranda921 mentioned this issue Jan 13, 2025

Fix synth_pref functions #511

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unable to generate Synth_Pref dataset #506

Unable to generate Synth_Pref dataset #506

ranarag commented Jan 10, 2025

ljvmiranda921 commented Jan 13, 2025

ljvmiranda921 commented Jan 13, 2025 •

edited

Loading

Unable to generate Synth_Pref dataset #506

Unable to generate Synth_Pref dataset #506

Comments

ranarag commented Jan 10, 2025

ljvmiranda921 commented Jan 13, 2025

ljvmiranda921 commented Jan 13, 2025 • edited Loading

ljvmiranda921 commented Jan 13, 2025 •

edited

Loading