Let LLMs Break Free from Overthinking via Self-Braking Tuning Paper • 2505.14604 • Published 17 days ago • 23
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios Paper • 2505.16944 • Published 15 days ago • 8
Training Step-Level Reasoning Verifiers with Formal Verification Tools Paper • 2505.15960 • Published 16 days ago • 7
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning Paper • 2505.15134 • Published 17 days ago • 6
General-Reasoner: Advancing LLM Reasoning Across All Domains Paper • 2505.14652 • Published 17 days ago • 22
Fine-tuning Quantized Neural Networks with Zeroth-order Optimization Paper • 2505.13430 • Published 18 days ago • 10
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training Paper • 2505.14681 • Published 17 days ago • 9
TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations Paper • 2505.18125 • Published 14 days ago • 109
QwenLong-CPRS: Towards ∞-LLMs with Dynamic Context Optimization Paper • 2505.18092 • Published 14 days ago • 42
NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning Paper • 2505.16022 • Published 16 days ago • 3
Interleaved Reasoning for Large Language Models via Reinforcement Learning Paper • 2505.19640 • Published 11 days ago • 12
Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective Paper • 2505.17652 • Published 14 days ago • 6
UFT: Unifying Supervised and Reinforcement Fine-Tuning Paper • 2505.16984 • Published 15 days ago • 3
Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs Paper • 2505.19075 • Published 12 days ago • 21
Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment Paper • 2505.11821 • Published 21 days ago • 13
Text2Grad: Reinforcement Learning from Natural Language Feedback Paper • 2505.22338 • Published 9 days ago • 6
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published 9 days ago • 116
Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem Paper • 2506.03295 • Published 3 days ago • 16