SebastianBodza/Kartoffel_Orpheus-3B_german_natural-v0.1 Text-to-Speech • Updated 20 days ago • 1.11k • 9
this paper has been blowing up

they train an open-source multimodal LLM (InternVL3) that can compete with GPT-4o and Claude 3.5 Sonnet by:
> training text and vision in a single stage
> a novel V2PE positional encoding
> SFT & mixed preference optimization
> test-time scaling

Paper: InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models (2504.10479)