Reward Models Collection Nemotron reward models for use in RLHF pipelines and as LLM-as-a-Judge • 8 items • Updated 6 days ago • 20
Mirage Datasets Collection Mirage family datasets used for training • 2 items • Updated Jun 17 • 1
Mirage Collection Mirage family of small language, multi-modal and reasoning models trained through supervised fine-tuning (SFT) • 11 items • Updated Jul 19 • 1
Phi-4 Collection Phi-4 family of small language, multi-modal and reasoning models. • 17 items • Updated Jul 10 • 178
BitNet Collection 🔥 BitNet family of large language models (1-bit LLMs). • 7 items • Updated May 1 • 49