BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack Paper • 2406.10149 • Published Jun 14, 2024 • 53
ATLAS: Learning to Optimally Memorize the Context at Test Time Paper • 2505.23735 • Published 8 days ago • 22
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published 4 days ago • 127
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time Paper • 2505.24863 • Published 7 days ago • 86
Article System Prompt Learning: Teaching LLMs to Learn Problem-Solving Strategies from Experience By codelion • 4 days ago • 10
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published 14 days ago • 77
Article AutoThink: Adaptive Reasoning for Large Language Models By codelion • 10 days ago • 4
Article OpenEvolve: An Open Source Implementation of Google DeepMind's AlphaEvolve By codelion • 17 days ago • 19
Article Introducing Pivotal Token Search (PTS): Targeting Critical Decision Points in LLM Training By codelion • 20 days ago • 5
Pivotal Token Search Collection Pivotal Token Search (PTS) identifies tokens in a language model's generation that significantly impact the probability of success • 9 items • Updated 24 days ago • 3
ZeroSearch: Incentivize the Search Capability of LLMs without Searching Paper • 2505.04588 • Published 30 days ago • 64
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 168
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math Paper • 2504.21233 • Published Apr 30 • 45
WebThinker: Empowering Large Reasoning Models with Deep Research Capability Paper • 2504.21776 • Published Apr 30 • 56