RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems Paper • 2508.01415 • Published Aug 2 • 7
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning Paper • 2507.16815 • Published Jul 22 • 37
SmolVLA: Efficient Vision-Language-Action Model trained on LeRobot Community Data Article • By danaaubakirova and 8 others • Published Jun 3 • 231
RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete Paper • 2502.21257 • Published Feb 28 • 2
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 280
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 301
CLEA: Closed-Loop Embodied Agent for Enhancing Task Execution in Dynamic Environments Paper • 2503.00729 • Published Mar 2 • 3
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published Mar 3 • 88
DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References Paper • 2502.09614 • Published Feb 13 • 11
STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning Paper • 2502.10177 • Published Feb 14 • 6
Qwen2.5-VL Collection • Vision-language model series based on Qwen2.5 • 11 items • 528
MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making Paper • 2409.16686 • Published Sep 25, 2024 • 10