-
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 28 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 30 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 122 -
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Paper • 2412.12098 • Published • 5
Collections
Discover the best community collections!
Collections including paper arxiv:2503.23077
-
Natural Language Reinforcement Learning
Paper • 2411.14251 • Published • 31 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 30 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 51 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 51
-
Evolving Deeper LLM Thinking
Paper • 2501.09891 • Published • 115 -
Reasoning Language Models: A Blueprint
Paper • 2501.11223 • Published • 33 -
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong
Paper • 2501.09775 • Published • 34 -
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Paper • 2501.09686 • Published • 41
-
Efficiently Serving LLM Reasoning Programs with Certaindex
Paper • 2412.20993 • Published • 38 -
Efficient Inference for Large Reasoning Models: A Survey
Paper • 2503.23077 • Published • 46 -
Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence
Paper • 2503.20533 • Published • 12