Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models Paper • 2508.09138 • Published 8 days ago • 34
OmniEAR: Benchmarking Agent Reasoning in Embodied Tasks Paper • 2508.05614 • Published 13 days ago • 18 • 2
Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models Paper • 2508.05613 • Published 13 days ago • 16
Test-Time Reinforcement Learning for GUI Grounding via Region Consistency Paper • 2508.05615 • Published 13 days ago • 20
Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face Article • By abidlabs and 4 others • 23 days ago • 156
Hierarchical Budget Policy Optimization for Adaptive Reasoning Paper • 2507.15844 • Published about 1 month ago • 16 • 2
LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization Paper • 2507.15758 • Published about 1 month ago • 34 • 1
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task Paper • 2502.11684 • Published Feb 17 • 2
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation Paper • 2506.03139 • Published Jun 3 • 16
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization Paper • 2402.17574 • Published Feb 27, 2024
GUI-G$^2$: Gaussian Reward Modeling for GUI Grounding Paper • 2507.15846 • Published about 1 month ago • 130