liumy2010/Qwen2.5-3B-countdown-SFT

Tags: Text Generation, text-generation-inference

# UFT

This repository contains the model presented in UFT: Unifying Supervised and Reinforcement Fine-Tuning.

Code: https://github.com/liumy2010/UFT
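
The card does not include a usage snippet, so the following is a minimal sketch of loading the checkpoint with the Hugging Face `transformers` library. The `build_prompt` helper and its wording are hypothetical: the prompt format used during fine-tuning is not stated here, so check the linked code repository for the actual template before relying on outputs.

```python
MODEL_ID = "liumy2010/Qwen2.5-3B-countdown-SFT"


def build_prompt(numbers: list[int], target: int) -> str:
    """Hypothetical Countdown-style prompt (assumption, not the training template)."""
    nums = ", ".join(str(n) for n in numbers)
    return (
        f"Using the numbers [{nums}], create an equation that equals {target}. "
        "You may use +, -, *, / and each number at most once."
    )


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and greedily decode a continuation of `prompt`."""
    # Imported lazily so the prompt helper works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate(build_prompt([3, 7, 25, 50], 82))` downloads the weights on first use and returns the model's continuation as a string. Greedy decoding (`do_sample=False`) is chosen here only to keep the sketch deterministic.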

## References

* [UFT: Unifying Supervised and Reinforcement Fine-Tuning](https://arxiv.org/abs/2505.16984)

Safetensors · Model size: 3.4B params · Tensor type: F32
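
As a rough guide to hardware requirements, the weight footprint can be estimated from the parameter count and tensor type listed above. This is a back-of-the-envelope sketch: it treats 3.4B as an exact figure, counts weights only (no activations or KV cache), and the half-precision line is an assumption about how one might load the model, not something the card states.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


PARAMS = 3.4e9  # rounded parameter count from the card

fp32_gb = weight_memory_gb(PARAMS, 4)  # F32 stores 4 bytes per parameter
half_gb = weight_memory_gb(PARAMS, 2)  # assumption: loaded as bf16/fp16 instead

print(f"F32 weights: ~{fp32_gb:.1f} GB, half precision: ~{half_gb:.1f} GB")
```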

Model tree for liumy2010/Qwen2.5-3B-countdown-SFT

Base model: Qwen/Qwen2.5-3B (this model is a fine-tune of it)