Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning • Paper • 2503.09516 • Published Mar 12 • 31
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time • Paper • 2505.24863 • Published 7 days ago • 87
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning • Paper • 2505.17667 • Published 14 days ago • 85
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models • Paper • 2505.24864 • Published 7 days ago • 112
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning • Paper • 2505.24298 • Published 8 days ago • 20
GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning • Paper • 2505.20355 • Published 12 days ago • 36
Interleaved Reasoning for Large Language Models via Reinforcement Learning • Paper • 2505.19640 • Published 11 days ago • 12
FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow • Paper • 2505.17399 • Published 15 days ago • 14
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles • Paper • 2505.19914 • Published 11 days ago • 40
One RL to See Them All: Visual Triple Unified Reinforcement Learning • Paper • 2505.18129 • Published 14 days ago • 59
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models • Paper • 2505.14810 • Published 17 days ago • 60
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning • Paper • 2505.16410 • Published 15 days ago • 55
JULI: Jailbreak Large Language Models by Self-Introspection • Paper • 2505.11790 • Published 21 days ago
Optimizing Anytime Reasoning via Budget Relative Policy Optimization • Paper • 2505.13438 • Published 18 days ago • 35
CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models • Paper • 2505.12504 • Published 19 days ago • 23