Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning Paper • 2508.08221 • Published 9 days ago • 39
Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL Paper • 2508.07976 • Published 9 days ago • 45
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning Paper • 2508.10433 • Published 6 days ago • 140
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models Paper • 2508.10751 • Published 6 days ago • 23
Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning Paper • 2507.17512 • Published 28 days ago • 36
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models Paper • 2507.13344 • Published Jul 17 • 55
Dynamic Chunking for End-to-End Hierarchical Sequence Modeling Paper • 2507.07955 • Published Jul 10 • 23
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning Paper • 2507.00432 • Published Jul 1 • 73
Ark: An Open-source Python-based Framework for Robot Learning Paper • 2506.21628 • Published Jun 24 • 15
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? Paper • 2506.11928 • Published Jun 13 • 24
Seedance 1.0: Exploring the Boundaries of Video Generation Models Paper • 2506.09113 • Published Jun 10 • 101
SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks Paper • 2506.10954 • Published Jun 12 • 51
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning Paper • 2506.09513 • Published Jun 11 • 98
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models Paper • 2506.06395 • Published Jun 5 • 129