The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason Paper • 2505.22653 • Published 10 days ago • 64
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published 15 days ago • 86
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning Paper • 2505.16410 • Published 16 days ago • 55
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published 18 days ago • 60
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Paper • 2501.12895 • Published Jan 22 • 63
Model Merging Collection Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 238