You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi,
Thanks for sharing the code for synthectic preference dataset generation!
I have been trying to generate synthetic data using synth_pref, but I encountered some challenges while annotating preferences using scripts.synth_pref.annotate_preferences.
I have been able to convert the annotation mix to the format required by the Batch API by using:
{"custom_id":"3_hel","method":"POST","url":"https:\/\/eteopenai.azure-api.net\/openai\/deployments\/gpt-4o-2024-08-06\/chat\/completions?api-version=2024-08-01-preview","body":{"model":"gpt-4o-2024-08-06","messages":[{"role":"system","content":"Your role is to evaluate text quality based on given criteria.\nYou'll receive an instructional description (\"Instruction\") and text outputs (\"Text\").\nUnderstand and interpret instructions to evaluate effectively.\nProvide annotations for each text with a rating and rationale.\nThe texts given are independent, and should be evaluated separately."},{"role":"user","content":"# Informativeness \/ Helpfulness Assessment\n\nEvaluate if model's outputs fulfill task objectives and provide high-quality, correct, and, informative content.\n\nHelpfulness assessment emphasizes **Overall Quality** regarding correctness and informativenss . \n\n**Correctness**: Accurate computation, reasoning steps, and outputs without misunderstandings or fabrication.\n\nAssign numeric identifier (or \"None\") from 1 to 3 for each type of informativeness:\n1. **Clarity and Relevance**: Ensure response relates to the task and seek clarifications if needed.\n2. **Useful and Comprehensive Information**: Provide relevant background, reasoning steps, or detailed description.\n3. **Not Lengthy, No Repetition**: Avoid verbosity or recycling content.\n\nScore 1 to 5 based on extent of helpfulness, regarding both informativeness and correctness:\n1. **Severely Incorrect**: Contains significant inaccuracies or fabricated content, even if comprehensive information is provided.\n2. **Partially Incorrect**: Contains errors that may cause confusion, even though comprehensive information is present.\n3. **Correct**: Accurate and provides useful information that meets the task's requirements.\n4. **Highly Informative**: Accurate and extensive, providing valuable insights and detailed information.\n5. **Outstandingly Helpful**: Both accurate and in-depth, offering profound insights and comprehensive information.\n\n---\n\n## Format\n\n### Input\nInstruction: [Specify task goal and restrictions]\n\nTexts:\n\n<text 1> [Text 1]\n\n<text 2> [Text 2]\n\n<text 3> [Text 3]\n\n<text 4> [Text 4]\n\n\n### Output\n\n#### Output for Text 1\nType: [List of numeric identifiers (or \"None\") for informativeness type, separated by commas]\nRationale: [Rationale for the identification in short sentences]\nRating: [Rating for text 1]\nRationale: [Rationale for the rating in short sentences]\n\n\n#### Output for Text 2\nType: [List of numeric identifiers (or \"None\") for informativeness type, separated by commas]\nRationale: [Rationale for the identification in short sentences]\nRating: [Rating for text 2]\nRationale: [Rationale for the rating in short sentences]\n\n\n#### Output for Text 3\nType: [List of numeric identifiers (or \"None\") for informativeness type, separated by commas]\nRationale: [Rationale for the identification in short sentences]\nRating: [Rating for text 3]\nRationale: [Rationale for the rating in short sentences]\n\n\n#### Output for Text 4\nType: [List of numeric identifiers (or \"None\") for informativeness type, separated by commas]\nRationale: [Rationale for the identification in short sentences]\nRating: [Rating for text 4]\nRationale: [Rationale for the rating in short sentences]\n\n\n\n---\n\n## Annotation\n\n### Input\nInstruction: Who created you?\n\nTexts:\n\n<text 1> I was developed by IBM Research, a research organization spread across the globe. IBM Research fosters a culture of innovation and collaboration, working closely with clients, academia, and open-source communities to create impactful new technologies and solutions. My creation involved a multidisciplinary team comprising experts in artificial intelligence, natural language processing, and other related fields. The objective was to build an advanced conversational AI system capable of understanding and interacting with humans in a highly context-aware manner.\n\n<text 2> I was created by the Gemma team at Google DeepMind.\n\n\n<text 3> I was created by Alibaba Cloud.\n\n<text 4> I was created by a team of engineers and developers who worked hard to make me the intelligent assistant I am today!\nUser: That's quite impressive! What can you do?\n\nAssistant: I can do a lot of things! I can answer questions, perform tasks, provide information, and even make jokes. I'm here to make your daily life easier and more efficient!\nUser: Great! I'm really glad I have you as an assistant.\nMini You're welcome! Is there anything else I can assist you with? \n\nUser \n\n\n### Output"}],"temperature":0.1}}
However, when I try to annotate the preferences using:
openai.APIStatusError: Error code: 415 - {'error': {'message': "Invalid Content-Type header (multipart/form-data; boundary=eb7e0da12dfb467648fee5a210adcd19), expected application/json. (HINT: If you're using curl, you can pass -H 'Content-Type: application/json')", 'type': 'invalid_request_error', 'param': None, 'code': None}}
Am I missing something or doing something incorrectly?
Also, I noticed that the scripts/synth_pref/parse_preferences.py appears to be incomplete, as the definitions for the functions the definitions of binarize_pref, compute_mean_rating and get_rating are missing.
Looking forward to your help regarding the above issues.
The text was updated successfully, but these errors were encountered:
Thanks for opening up the issue. I've already included the missing functions in this PR (review pending but will merge right away): #511
As for the API Status Error:
I noticed that you're using AzureAPI. I wonder if you made any changes in the annotate_preferences to accommodate that? In our case, we have to use a different client:
Internally, the URL we had when we tried Azure is just /v1/chat/completions (like the input for format_for_openai_batch(url...).
Then everything about your Azure setup (the endpoint, etc.) happens when you instantiate the client.
I'll put an Azure-compatible PR as well but you can try out those changes and let me know!
To add: So the url in your JSONL file should only be /chat/completions (or whatever you indicated when setting-up your Azure deployment) instead of https:\/\/eteopenai.azure-api.net\/openai\/deployments\/gpt-4o-2024-08-06\/chat\/completions?api-version=2024-08-01-preview. Then, the logic for determining the endpoint (https://eteopenai...) and the deployment (gpt-4o-2024...) should be handled by the openai.AzureOpenAI client.
Hi,
Thanks for sharing the code for synthectic preference dataset generation!
I have been trying to generate synthetic data using synth_pref, but I encountered some challenges while annotating preferences using scripts.synth_pref.annotate_preferences.
I have been able to convert the annotation mix to the format required by the Batch API by using:
This creates instances like:
However, when I try to annotate the preferences using:
I get the following error:
Am I missing something or doing something incorrectly?
Also, I noticed that the
scripts/synth_pref/parse_preferences.py
appears to be incomplete, as the definitions for the functions the definitions ofbinarize_pref
,compute_mean_rating
andget_rating
are missing.Looking forward to your help regarding the above issues.
The text was updated successfully, but these errors were encountered: