OpsEval / data_v2 /zte_zh_mc_gen.csv
update leaderboard 2024-09-06
name,zero_self_con,zero_cot_self_con,few_self_con,few_cot_self_con
Baichuan-13B-Chat,11.13,28.61,13.22,33.97
Chatglm2-6B,23.12,24.08,30.46,35.9
Chatglm3-6B,32.6,35.4,28.3,40.9
Chinese-Alpaca-2-13B,22.69,24.59,40.52,40.73
Chinese-Llama-2-13B,17.98,17.83,31.66,36.24
Devops-Model-14B-Chat,42.7,53.57,57.25,54.29
Ernie-Bot-4.0,45.99,48.98,46.0,54.0
Glm3-Turbo,43.0,,,
Glm4,50.0,,,
Gpt-3.5-Turbo,36.83,39.25,39.77,42.15
Gpt-4,,62.11,,65.68
Internlm-7B,27.81,19.95,24.18,35.35
Internlm2-Chat-20B,44.6,47.0,62.2,38.3
Internlm2-Chat-7B,38.8,44.6,46.0,35.8
Llama-2-13B,27.16,29.99,36.15,39.02
Llama-2-70B-Chat,24.38,43.63,44.65,48.84
Llama-2-7B,23.47,29.26,30.03,31.93
Mistral-7B,1.27,42.05,30.72,46.44
Qwen-14B-Chat,41.44,47.98,49.92,58.85
Qwen-72B-Chat,64.79,65.72,70.19,68.38
Qwen-7B-Chat,36.5,33.51,40.59,31.46
Yi-34B-Chat,64.58,65.51,70.92,47.97
Claude-3-Opus,51.4,,,
gemma_2b,25.6,28.3,19.1,35.5
gemma_7b,27.3,35.4,17.3,44.5
Meta-Llama-3-70B-Instruct,31.1,37.4,51.1,36.9
Meta-Llama-3-8B-Instruct,31.1,34.3,36.0,37.1
Qwen1.5-14B-Base,49.1,49.9,62.5,41.3
Qwen1.5-14B-Chat,38.9,50.5,55.2,52.7