all_results.md

alpaca-eval-ko_helpful

| model | score-mean | score-std | score-count |
| --- | --- | --- | --- |
| GT | 2.55832 | 0.947323 | 797 |
| maywell/Synatra-Yi-Ko-6B | 2.35552 | 0.884752 | 797 |
| 42dot/42dot_LLM-SFT-1.3B | 2.25235 | 0.933403 | 797 |
| gemma-7b-it | 2.20721 | 0.874811 | 797 |
| gemma-2b-it | 2.10127 | 0.991608 | 797 |
| KT-AI/midm-bitext-S-7B-inst-v1 | 1.91185 | 1.05215 | 797 |
| nlpai-lab/kullm-polyglot-12.8b-v3 | 1.80249 | 0.920562 | 797 |
| nlpai-lab/kullm-polyglot-5.8b-v2 | 1.59781 | 1.13544 | 797 |
| beomi/KoAlpaca-Polyglot-5.8B | 1.12279 | 0.829334 | 797 |
| kfkas/Llama-2-ko-7b-Chat | 0.635635 | 0.750805 | 8 |
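
Because the rows above do not all share the same score-count, the means are not equally reliable. As a rough illustration (not part of the original evaluation), the sketch below derives a standard error and an approximate 95% interval from the reported mean, std, and count. The helper names `standard_error` and `mean_interval` are hypothetical, and the normal approximation (`z = 1.96`) is an assumption that is crude for very small counts.

```python
import math


def standard_error(std: float, count: int) -> float:
    """Standard error of the mean, given a sample std and sample size."""
    if count <= 0:
        raise ValueError("count must be positive")
    return std / math.sqrt(count)


def mean_interval(mean: float, std: float, count: int, z: float = 1.96):
    """Approximate interval mean +/- z * SE (normal approximation, assumed)."""
    half_width = z * standard_error(std, count)
    return mean - half_width, mean + half_width


# (model, score-mean, score-std, score-count) copied from the table above.
rows = [
    ("GT", 2.55832, 0.947323, 797),
    ("maywell/Synatra-Yi-Ko-6B", 2.35552, 0.884752, 797),
    ("kfkas/Llama-2-ko-7b-Chat", 0.635635, 0.750805, 8),
]

for model, mean, std, count in rows:
    low, high = mean_interval(mean, std, count)
    print(f"{model}: {mean:.3f} [{low:.3f}, {high:.3f}] (n={count})")
```

The interval for a row with only 8 scored samples comes out much wider than for rows with 797, so that row's mean should be read with more caution.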

gpt4evol_helpful

| model | score-mean | score-std | score-count |
| --- | --- | --- | --- |
| GT | 3.20818 | 0.865725 | 100 |
| maywell/Synatra-Yi-Ko-6B | 2.40428 | 0.782667 | 100 |
| heegyu/42dot-SFT-DPO-v0.1-steps-63332 | 2.23072 | 0.801808 | 100 |
| heegyu/42dot-SFT-DPO-v0.1-steps-189996 | 2.2285 | 0.706424 | 100 |
| 42dot/42dot_LLM-SFT-1.3B | 2.16817 | 0.700063 | 200 |
| heegyu/42dot-SFT-DPO-v0.1-steps-253328 | 2.15263 | 0.76198 | 100 |
| heegyu/42dot-SFT-DPO-v0.1-steps-126664 | 2.13087 | 0.825525 | 100 |
| KT-AI/midm-bitext-S-7B-inst-v1 | 1.99792 | 0.773687 | 100 |
| nlpai-lab/kullm-polyglot-5.8b-v2 | 1.90725 | 0.734876 | 100 |
| heegyu/Yi-ko-6B-OKI-v20231124-2e-5-epoch-1 | 1.75422 | 0.76077 | 100 |
| nlpai-lab/kullm-polyglot-12.8b-v3 | 1.67046 | 0.756581 | 100 |
| beomi/KoAlpaca-Polyglot-12.8B | 1.28939 | 0.641955 | 100 |

ko-ethical-questions_helpful

| model | score-mean | score-std | score-count |
| --- | --- | --- | --- |
| nlpai-lab/kullm-polyglot-5.8b-v2 | 3.20663 | 1.37428 | 100 |
| beomi/KoAlpaca-Polyglot-12.8B | 3.06111 | 1.0764 | 100 |
| 42dot/42dot_LLM-SFT-1.3B | 2.54959 | 1.36998 | 100 |
| KT-AI/midm-bitext-S-7B-inst-v1 | 2.54674 | 1.67282 | 100 |

ko-ethical-questions_safety

| model | score-mean | score-std | score-count |
| --- | --- | --- | --- |
| GT | 0.841568 | 0.997286 | 100 |
| heegyu/Yi-ko-6B-OKI-v20231124-2e-5-epoch-1 | -0.147002 | 1.51299 | 100 |
| maywell/Synatra-Yi-Ko-6B | -0.233675 | 1.48525 | 100 |
| heegyu/42dot-SFT-DPO-v0.1-steps-126664 | -0.747935 | 1.76688 | 100 |
| gemma-7b-it | -0.827084 | 1.77708 | 100 |
| 42dot/42dot_LLM-SFT-1.3B | -0.832895 | 1.71233 | 100 |
| heegyu/42dot-SFT-DPO-v0.1-steps-253328 | -0.841021 | 1.70876 | 100 |
| heegyu/42dot-SFT-DPO-v0.1-steps-189996 | -0.862425 | 1.93667 | 100 |
| heegyu/42dot-SFT-DPO-v0.1-steps-63332 | -1.04098 | 1.97453 | 100 |
| KT-AI/midm-bitext-S-7B-inst-v1 | -1.08714 | 1.93805 | 100 |
| gemma-2b-it | -1.10114 | 1.71639 | 100 |
| nlpai-lab/kullm-polyglot-12.8b-v3 | -1.28393 | 2.03808 | 100 |
| nlpai-lab/kullm-polyglot-5.8b-v2 | -1.44178 | 2.03098 | 100 |
| beomi/KoAlpaca-Polyglot-12.8B | -1.71236 | 2.03657 | 100 |

pku-saferlhf-ko_helpful

| model | score-mean | score-std | score-count |
| --- | --- | --- | --- |
| 42dot/42dot_LLM-SFT-1.3B | 2.4871 | 1.6729 | 100 |

pku-saferlhf-ko_safety

| model | score-mean | score-std | score-count |
| --- | --- | --- | --- |
| 42dot/42dot_LLM-SFT-1.3B | -0.249942 | 2.2079 | 100 |
| heegyu/42dot-SFT-DPO-v0.1-steps-126664 | -0.770537 | 2.58294 | 100 |
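
Each table above reports a per-model mean, standard deviation, and count, sorted by mean in descending order. How these tables were actually generated is not stated here; the sketch below is one plausible way to build such a summary from per-sample scores. The `summarize` and `to_markdown` helpers, the `model-a`/`model-b` names, and the input scores are all hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize(records):
    """Group (model, score) pairs and return rows sorted by mean, descending."""
    by_model = defaultdict(list)
    for model, score in records:
        by_model[model].append(score)

    rows = []
    for model, scores in by_model.items():
        # Sample std needs at least two scores; report NaN otherwise.
        std = stdev(scores) if len(scores) > 1 else float("nan")
        rows.append((model, mean(scores), std, len(scores)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


def to_markdown(rows):
    """Render summary rows with the same column names the tables use."""
    lines = [
        "| model | score-mean | score-std | score-count |",
        "| --- | --- | --- | --- |",
    ]
    for model, score_mean, score_std, count in rows:
        lines.append(f"| {model} | {score_mean:.6g} | {score_std:.6g} | {count} |")
    return "\n".join(lines)


# Made-up scores for two placeholder models, for illustration only.
records = [
    ("model-a", 2.0),
    ("model-a", 3.0),
    ("model-b", 1.0),
    ("model-b", 1.5),
    ("model-b", 2.0),
]
rows = summarize(records)
print(to_markdown(rows))
```

Keeping the count in every row makes it easy to spot models that were scored on fewer samples than the rest of a benchmark.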