nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 Text Generation • 50B • Updated 21 days ago • 18.4k • 178
nvidia/Llama-3.1-Nemotron-8B-UltraLong-4M-Instruct Text Generation • 8B • Updated Apr 17 • 25.2k • 117
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B Text Generation • 8B • Updated Feb 24 • 1.69M • 696
Running • 3.08k • The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B Text Generation • 2B • Updated Feb 24 • 712k • 1.31k