liumy2010's Collections
UFT
Updated May 28
UFT: Unifying Supervised and Reinforcement Fine-Tuning
UFT: Unifying Supervised and Reinforcement Fine-Tuning • Paper • 2505.16984 • Published May 22 • 3
liumy2010/Llama-3.2-1B-countdown-R3 • Text Generation • 1B • Updated May 30 • 4
liumy2010/Llama-3.2-1B-countdown-RFT • Text Generation • 1B • Updated May 30 • 6
liumy2010/Llama-3.2-1B-countdown-SFT • Text Generation • 1B • Updated May 30 • 5
liumy2010/Llama-3.2-1B-countdown-SFT-RFT • Text Generation • 1B • Updated May 30 • 4
liumy2010/Llama-3.2-1B-countdown-UFT • Text Generation • 1B • Updated May 30 • 4
liumy2010/Llama-3.2-1B-kk_logic-R3 • Text Generation • 1B • Updated May 30 • 4
liumy2010/Llama-3.2-1B-kk_logic-RFT • Text Generation • 1B • Updated May 30 • 3
liumy2010/Llama-3.2-1B-kk_logic-SFT • Text Generation • 1B • Updated May 30 • 5
liumy2010/Llama-3.2-1B-kk_logic-SFT-RFT • Text Generation • 1B • Updated May 30 • 4
liumy2010/Llama-3.2-1B-kk_logic-UFT • Text Generation • 1B • Updated May 30 • 4
liumy2010/Llama-3.2-1B-math-R3 • Text Generation • 1B • Updated May 30 • 4
liumy2010/Llama-3.2-1B-math-RFT • Text Generation • 1B • Updated May 30 • 4
liumy2010/Llama-3.2-1B-math-SFT • Text Generation • 1B • Updated May 30 • 4
liumy2010/Llama-3.2-1B-math-SFT-RFT • Text Generation • 1B • Updated May 30 • 4
liumy2010/Llama-3.2-1B-math-UFT • Text Generation • 1B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-countdown-R3 • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-countdown-RFT • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-countdown-SFT • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-countdown-SFT-RFT • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-countdown-UFT • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-kk_logic-R3 • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-kk_logic-RFT • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-kk_logic-SFT • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-kk_logic-SFT-RFT • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-kk_logic-UFT • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-math-R3 • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-math-RFT • Text Generation • 4B • Updated May 30 • 15
liumy2010/Llama-3.2-3B-math-SFT • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-math-SFT-RFT • Text Generation • 4B • Updated May 30 • 4
liumy2010/Llama-3.2-3B-math-UFT • Text Generation • 4B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-countdown-R3 • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-countdown-RFT • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-countdown-SFT • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-countdown-SFT-RFT • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-countdown-UFT • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-kk_logic-R3 • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-kk_logic-RFT • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-kk_logic-SFT • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-kk_logic-SFT-RFT • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-kk_logic-UFT • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-math-R3 • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-math-RFT • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-math-SFT • Text Generation • 0.6B • Updated May 30 • 3
liumy2010/Qwen2.5-0.5B-math-SFT-RFT • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-0.5B-math-UFT • Text Generation • 0.6B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-countdown-R3 • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-countdown-RFT • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-countdown-SFT • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-countdown-SFT-RFT • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-countdown-UFT • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-kk_logic-R3 • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-kk_logic-RFT • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-kk_logic-SFT • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-kk_logic-SFT-RFT • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-kk_logic-UFT • Text Generation • 2B • Updated May 30 • 3
liumy2010/Qwen2.5-1.5B-math-R3 • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-math-RFT • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-math-SFT • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-math-SFT-RFT • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-1.5B-math-UFT • Text Generation • 2B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-countdown-R3 • Text Generation • 3B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-countdown-RFT • Text Generation • 3B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-countdown-SFT • Text Generation • 3B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-countdown-SFT-RFT • Text Generation • 3B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-countdown-UFT • Text Generation • 3B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-kk_logic-R3 • Text Generation • 3B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-kk_logic-RFT • Text Generation • 3B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-kk_logic-SFT • Text Generation • 3B • Updated May 30 • 8
liumy2010/Qwen2.5-3B-kk_logic-SFT-RFT • Text Generation • 3B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-kk_logic-UFT • Text Generation • 3B • Updated May 30 • 3
liumy2010/Qwen2.5-3B-math-R3 • Text Generation • 3B • Updated May 30 • 3
liumy2010/Qwen2.5-3B-math-RFT • Text Generation • 3B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-math-SFT • Text Generation • 3B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-math-SFT-RFT • Text Generation • 3B • Updated May 30 • 4
liumy2010/Qwen2.5-3B-math-UFT • Text Generation • 3B • Updated May 30 • 4
liumy2010/UFT-Countdown • Viewer • Updated May 28 • 11.3k • 9
liumy2010/UFT-MATH_3_4_5 • Viewer • Updated May 28 • 9.26k • 9
liumy2010/UFT-Logic • Viewer • Updated May 28 • 5k • 2
liumy2010/UFT-Other_Evaluation_Datasets • Viewer • Updated May 28 • 4.93k • 1