ATLAS: Learning to Optimally Memorize the Context at Test Time • Paper • 2505.23735 • Published 9 days ago • 22
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models • Paper • 2505.22617 • Published 10 days ago • 116
Quartet: Native FP4 Training Can Be Optimal for Large Language Models • Paper • 2505.14669 • Published 18 days ago • 73
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning • Paper • 2505.15134 • Published 17 days ago • 6
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models • Paper • 2505.11711 • Published 21 days ago • 10
Learning Dynamics in Continual Pre-Training for Large Language Models • Paper • 2505.07796 • Published 26 days ago • 19
Absolute Zero: Reinforced Self-play Reasoning with Zero Data • Paper • 2505.03335 • Published May 6 • 169
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale • Paper • 2505.03005 • Published May 5 • 31
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference • Paper • 2505.02922 • Published May 5 • 27
Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts • Paper • 2504.21117 • Published Apr 29 • 25
Do PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning vs. Memorization in Large Language Models • Paper • 2504.05262 • Published Apr 7 • 11
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations • Paper • 2504.10481 • Published Apr 14 • 84
Heimdall: test-time scaling on the generative verification • Paper • 2504.10337 • Published Apr 14 • 33
Temporal Consistency for LLM Reasoning Process Error Identification • Paper • 2503.14495 • Published Mar 18 • 10
APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding • Paper • 2502.05431 • Published Feb 8 • 6