UFT: Unifying Supervised and Reinforcement Fine-Tuning
Mingyang Liu
liumy2010
Recent Activity
Updated a model 8 days ago: liumy2010/Qwen2.5-3B-math-UFT
Updated a model 8 days ago: liumy2010/Qwen2.5-3B-math-SFT-RFT
Updated a model 8 days ago: liumy2010/Qwen2.5-3B-math-SFT
Collections: 1
Papers: 1
Models: 75
liumy2010/Qwen2.5-3B-math-UFT • Text Generation • 3
liumy2010/Qwen2.5-3B-math-SFT-RFT • Text Generation • 8
liumy2010/Qwen2.5-3B-math-SFT • Text Generation • 3
liumy2010/Qwen2.5-3B-math-RFT • Text Generation • 6
liumy2010/Qwen2.5-3B-math-R3 • Text Generation • 7
liumy2010/Qwen2.5-3B-kk_logic-UFT • Text Generation • 6
liumy2010/Qwen2.5-3B-kk_logic-SFT-RFT • Text Generation • 12
liumy2010/Qwen2.5-3B-kk_logic-SFT • Text Generation • 3
liumy2010/Qwen2.5-3B-kk_logic-RFT • Text Generation • 6
liumy2010/Qwen2.5-3B-kk_logic-R3 • Text Generation • 6