SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law Paper • 2507.18576 • Published 27 days ago • 4
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization Paper • 2504.05812 • Published Apr 8 • 3
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28 • 128
Sherlock: Self-Correcting Reasoning in Vision-Language Models Paper • 2505.22651 • Published May 28 • 51
Sherlock Collection Model series for the paper "Sherlock: Self-Correcting Reasoning in Vision-Language Models" • 5 items • Updated May 29 • 3
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published Dec 27, 2024 • 88