Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning Paper • 2505.24726 • Published 8 days ago • 169
VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments Paper • 2506.02387 • Published 4 days ago • 56
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation Paper • 2506.03147 • Published 4 days ago • 56
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis Paper • 2506.02096 • Published 5 days ago • 50
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation Paper • 2505.24714 • Published 8 days ago • 35
CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs Paper • 2505.24120 • Published 8 days ago • 48
OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models Paper • 2506.03135 • Published 4 days ago • 34
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation Paper • 2506.02397 • Published 4 days ago • 34
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents Paper • 2506.03143 • Published 4 days ago • 38
MotionSight: Boosting Fine-Grained Motion Understanding in Multimodal LLMs Paper • 2506.01674 • Published 5 days ago • 25
Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics Paper • 2506.00070 • Published 9 days ago • 26
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published 8 days ago • 114
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time Paper • 2505.24863 • Published 8 days ago • 89
Time Blindness: Why Video-Language Models Can't See What Humans Can? Paper • 2505.24867 • Published 8 days ago • 73
HardTests: Synthesizing High-Quality Test Cases for LLM Coding Paper • 2505.24098 • Published 8 days ago • 42
Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation Paper • 2505.18842 • Published 14 days ago • 36
ViStoryBench: Comprehensive Benchmark Suite for Story Visualization Paper • 2505.24862 • Published 8 days ago • 31
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models Paper • 2505.24025 • Published 8 days ago • 24