Open
Description
Hello, and thanks for your nice work. While reproducing it, I found differences between my selected data, `selected_data/mmlu/top_p0.05.jsonl`, and the reference processed data, `mmlu-chat_adam_sim_trainp0.05_seed3_p0.05.jsonl`. My selected file contains 13525 entries, while the reference contains 13533, and only 11.6% of the entries match. I suspect the discrepancy comes from my use of `mistral-7b` for data selection, but I would not expect that alone to produce such a large difference. Was the processed data selected using `Llama2-7b`? If so, I will rerun the selection with that model and try to reproduce the same result.
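For reference, here is a minimal sketch of how a match rate between two JSONL files could be computed. It is not necessarily how the 11.6% figure above was obtained. The function names `load_keys` and `overlap_stats` are hypothetical, and the sketch assumes that two entries "match" when their full JSON records are identical. If the files carry a stable identifier field, its name can be passed as `key` instead; no such field name is known from this issue.

```python
import json


def load_keys(path, key=None):
    """Return the set of record identifiers found in a JSONL file.

    If `key` is None, the whole record (serialized with sorted keys) is
    used as the identifier. Otherwise record[key] is used, which assumes
    that field exists and holds a hashable value.
    """
    keys = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if key is None:
                keys.add(json.dumps(record, sort_keys=True))
            else:
                keys.add(record[key])
    return keys


def overlap_stats(path_a, path_b, key=None):
    """Count unique records in each file and how many appear in both."""
    a = load_keys(path_a, key)
    b = load_keys(path_b, key)
    shared = a & b
    return {
        "unique_a": len(a),
        "unique_b": len(b),
        "shared": len(shared),
        # Fraction of the reference file (b) that also appears in a.
        "match_rate_vs_b": len(shared) / len(b) if b else 0.0,
    }
```

Calling `overlap_stats("selected_data/mmlu/top_p0.05.jsonl", "mmlu-chat_adam_sim_trainp0.05_seed3_p0.05.jsonl")` would report the counts for the two files discussed here. Because the sets deduplicate records, the unique counts can be lower than the raw line counts if a file contains repeated entries.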