Orsta is a family of high-performance Vision-Language Models (7B–32B) trained on 8 diverse tasks, including detection, grounding, math, and visual puzzles. The models are optimized for both visual perception and reasoning. They are trained with V-Triune, a unified RL framework that streamlines multi-task learning across vision-language domains. Across its 7B and 32B variants, Orsta delivers improvements ranging from +2.1 to +14.1, and the gains extend to a wide range of downstream tasks.

Explore our models, tasks, and results in the technical report.