Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design • Paper • 2506.04734 • Published Jun 5
360Zhinao2 • Collection • 360Zhinao2 language models, including both base and chat models • 7 items • Updated Mar 5