Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated 17 days ago • 141
Table-R1: Inference-Time Scaling for Table Reasoning Paper • 2505.23621 • Published 9 days ago • 88
SmolVLA Collection Small, efficient and light-weight VLAs pretrained on community datasets • 1 item • Updated 6 days ago • 19
Article *Context Is Gold to Find the Gold Passage*: Evaluating and Training Contextual Document Embeddings By manu and 1 other • 5 days ago • 23
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space Paper • 2505.15778 • Published 17 days ago • 15
VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection Paper • 2505.20289 • Published 12 days ago • 10
VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Gudied Iterative Policy Optimization Paper • 2505.19000 • Published 13 days ago • 42
RL with KL penalties is better viewed as Bayesian inference Paper • 2205.11275 • Published May 23, 2022 • 1
One-RL-to-See-Them-All Collection https://github.com/MiniMax-AI/One-RL-to-See-Them-All • 5 items • Updated 12 days ago • 12
One RL to See Them All: Visual Triple Unified Reinforcement Learning Paper • 2505.18129 • Published 15 days ago • 59
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen! Paper • 2505.15656 • Published 17 days ago • 14
Article Tiny Agents in Python: a MCP-powered agent in ~70 lines of code By celinah and 3 others • 15 days ago • 122
MMaDA: Multimodal Large Diffusion Language Models Paper • 2505.15809 • Published 17 days ago • 85