ArmBench-LLM / unified_exam_results.csv
daniel7an
Model,Armenian language and literature,Armenian history,Mathematics,Average
claude-3-7-sonnet-20250219,10.5,7.75,15.0,11.08
claude-3-5-sonnet-20241022,10.0,9.25,12.75,10.67
gemini-2.0-flash,5.5,6.75,17.25,9.83
gpt-4o,6.75,6.75,13.25,8.92
qwen-max-2025-01-25,7.25,4.5,14.25,8.67
gemini-1.5-flash,4.75,3.75,15.0,7.83
DeepSeek-V3,5.25,5.0,12.25,7.5
Meta-Llama-3.3-70B-Instruct,4.5,5.25,11.5,7.08
claude-3-5-haiku-20241022,5.0,3.75,10.75,6.5
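
The Average column appears to be the unweighted mean of the three subject scores, rounded to two decimals. This is an inference from the numbers; the file itself does not say how the column was computed. A minimal Python sketch of that check follows, with the rows copied verbatim from the table above. The names `CSV_TEXT` and `recompute_averages` are hypothetical and exist only for this illustration.

```python
import csv
import io

# Rows copied verbatim from the table above.
CSV_TEXT = """\
Model,Armenian language and literature,Armenian history,Mathematics,Average
claude-3-7-sonnet-20250219,10.5,7.75,15.0,11.08
claude-3-5-sonnet-20241022,10.0,9.25,12.75,10.67
gemini-2.0-flash,5.5,6.75,17.25,9.83
gpt-4o,6.75,6.75,13.25,8.92
qwen-max-2025-01-25,7.25,4.5,14.25,8.67
gemini-1.5-flash,4.75,3.75,15.0,7.83
DeepSeek-V3,5.25,5.0,12.25,7.5
Meta-Llama-3.3-70B-Instruct,4.5,5.25,11.5,7.08
claude-3-5-haiku-20241022,5.0,3.75,10.75,6.5
"""


def recompute_averages(csv_text):
    """Return {model: (reported_average, recomputed_average)}.

    Assumption (not stated in the source): Average is the plain mean
    of every column other than Model and Average.
    """
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        subjects = [
            float(value)
            for column, value in row.items()
            if column not in ("Model", "Average")
        ]
        recomputed = sum(subjects) / len(subjects)
        result[row["Model"]] = (float(row["Average"]), recomputed)
    return result


averages = recompute_averages(CSV_TEXT)
for model, (reported, recomputed) in averages.items():
    print(f"{model}: reported={reported}, recomputed={recomputed:.2f}")
```

For every row, the recomputed mean agrees with the reported Average to within rounding, so the table is internally consistent under this assumption.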