Collection of models and the paper for the work "Reinforcement Learning for Reasoning in Large Language Models with One Training Example"
- Reinforcement Learning for Reasoning in Large Language Models with One Training Example
  Paper • 2504.20571
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1
  Text Generation
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi13
  Text Generation
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1_pi13
  Text Generation