Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published 4 days ago • 127
HardTests: Synthesizing High-Quality Test Cases for LLM Coding Paper • 2505.24098 • Published 8 days ago • 41
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents Paper • 2505.20411 • Published 11 days ago • 84
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published 9 days ago • 116
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers Paper • 2505.21497 • Published 10 days ago • 91
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published 14 days ago • 85
One RL to See Them All: Visual Triple Unified Reinforcement Learning Paper • 2505.18129 • Published 14 days ago • 59
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design Paper • 2505.16175 • Published 16 days ago • 39
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning Paper • 2505.15966 • Published 16 days ago • 51
General-Reasoner: Advancing LLM Reasoning Across All Domains Paper • 2505.14652 • Published 17 days ago • 22
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published 17 days ago • 129
AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection Paper • 2505.07293 • Published 25 days ago • 26
Flow-GRPO: Training Flow Matching Models via Online RL Paper • 2505.05470 • Published 29 days ago • 78