hdong0/Qwen2.5-Math-1.5B-Open-R1-GRPO_deepscaler_100steps_lr1e-6_kl1e-3_acc • Text Generation • Updated 9 days ago • 10
hdong0/Qwen2.5-Math-1.5B-batch-mix-Open-R1-GRPO_deepscaler_100steps_lr1e-6_kl1e-3_acc • Text Generation • Updated 8 days ago • 11
hdong0/Qwen2.5-Math-1.5B-Open-R1-GRPO_deepscaler_1000steps_lr1e-6_kl1e-3_acc • Text Generation • Updated 9 days ago • 42
hdong0/deepseek-Qwen2.5-Math-1.5B-Open-R1-GRPO_deepscaler_1000steps_lr1e-6_acc • Text Generation • Updated 7 days ago • 37