OpsEval / data_v2 /lenovo_zh_mc_gen.csv
update leaderboard 2024-09-06
name,zero_self_con,zero_cot_self_con,few_self_con,few_cot_self_con
Baichuan2-13B-Chat,60.0,67.5,60.0,67.5
Chatglm3-6B,60.0,60.0,55.0,60.0
Devops-Model-14B-Chat,67.5,57.5,70.0,70.0
Ernie-Bot-4.0,75.0,77.5,75.0,82.5
Gpt-3.5-Turbo,62.5,70.0,57.5,62.5
GPT-4,77.5,82.5,77.5,82.5
Llama-2-13B,45.0,62.5,60.0,55.0
Llama-2-70B-Chat,22.5,75.0,20.0,57.5
Llama-2-7B,32.5,45.0,60.0,55.0
Mistral-7B,47.5,62.5,35.0,60.0
Qwen-14B-Chat,67.5,67.5,65.0,67.5
Qwen-72B-Chat,72.5,75.0,75.0,75.0
Yi-34B-Chat,75.0,82.5,57.5,52.5
Claude-3-Opus,71.42857142857143,,,
Meta-Llama-3-8B-Instruct,47.14285714285714,44.285714285714285,45.714285714285715,32.857142857142854