# UFT

This repository contains `liumy2010/Qwen2.5-3B-countdown-SFT`, the model presented in the paper UFT: Unifying Supervised and Reinforcement Fine-Tuning.

Code: https://github.com/liumy2010/UFT
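The model card does not include a usage snippet, so the following is a minimal sketch of loading the checkpoint with the Hugging Face `transformers` library. It assumes the repository `liumy2010/Qwen2.5-3B-countdown-SFT` ships a tokenizer alongside the weights. The `build_prompt` helper and its wording are hypothetical: the model name suggests a Countdown-style arithmetic task, but the prompt format used during fine-tuning is not documented here, so consult the code repository for the real one.

```python
MODEL_ID = "liumy2010/Qwen2.5-3B-countdown-SFT"


def build_prompt(numbers, target):
    """Build a Countdown-style prompt.

    ASSUMPTION: this wording is illustrative only; the prompt format
    used to train the model may differ.
    """
    nums = ", ".join(str(n) for n in numbers)
    return (
        f"Using the numbers [{nums}], create an equation that equals {target}. "
        "Use each number at most once with +, -, *, /."
    )


def generate(prompt, max_new_tokens=256):
    """Load the checkpoint and greedily decode a completion for `prompt`."""
    # Imported lazily so build_prompt works without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # ASSUMPTION: the repository contains tokenizer files as well as weights.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Drop the prompt tokens so only the newly generated text is returned.
    completion_ids = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(completion_ids, skip_special_tokens=True)
```

Calling `generate(build_prompt([3, 7, 25, 50], 81))` downloads the weights on first use and runs on CPU by default; moving the model and inputs to a GPU is left out to keep the sketch short.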

## References

* [UFT: Unifying Supervised and Reinforcement Fine-Tuning](https://arxiv.org/abs/2505.16984)

## Model details

* Model: liumy2010/Qwen2.5-3B-countdown-SFT
* Base model: Qwen/Qwen2.5-3B (this model is a fine-tune of it)
* Model size: 3.4B params
* Tensor type: F32
* Weights format: Safetensors