Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective Paper • 2505.17652 • Published 15 days ago • 6
Not All Correct Answers Are Equal: Why Your Distillation Source Matters Paper • 2505.14464 • Published 18 days ago • 8
SampleMix: A Sample-wise Pre-training Data Mixing Strategey by Coordinating Data Quality and Diversity Paper • 2503.01506 • Published Mar 3 • 9
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems Paper • 2408.16293 • Published Aug 29, 2024 • 27
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published May 16, 2024 • 131