Article Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H By Hcompany and 1 other • 4 days ago • 60
Article LeRobot Community Datasets: The “ImageNet” of Robotics — When and How? By danaaubakirova and 6 others • 27 days ago • 57
DeepSeek R1 (All Versions) Collection DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 37 items • Updated 8 days ago • 237
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated 17 days ago • 141
Article Blazingly fast whisper transcriptions with Inference Endpoints By mfuntowicz and 5 others • 25 days ago • 67
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play Paper • 2505.02707 • Published May 5 • 82
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • Jan 23 • 68
Article NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets By mingyuliutw and 4 others • Mar 18 • 41
Phi-4 Collection Phi-4 family of small language, multi-modal and reasoning models. • 13 items • Updated May 1 • 154
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Apr 28 • 317
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated Apr 12 • 65
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 160