Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning Paper • 2506.04207 • Published 2 days ago • 41
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning Paper • 2505.24726 • Published 7 days ago • 165
view article Article Meet Your AI Coding Sidekick: Augment Code vs. Cursor (2025 Showdown) By lynn-mikami • 9 days ago • 4
view article Article 🌙 Introducing **Moon**: Storytelling Generator Model By kulia-moon and 1 other • 7 days ago • 6
view article Article *Context Is Gold to Find the Gold Passage*: Evaluating and Training Contextual Document Embeddings By manu and 1 other • 4 days ago • 22
view article Article Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H By Hcompany and 1 other • 3 days ago • 60
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published 4 days ago • 127
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing Paper • 2505.21600 • Published 10 days ago • 68
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published 7 days ago • 112
view article Article System Prompt Learning: Teaching LLMs to Learn Problem-Solving Strategies from Experience By codelion • 4 days ago • 10
TradExpert: Revolutionizing Trading with Mixture of Expert LLMs Paper • 2411.00782 • Published Oct 16, 2024 • 2
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data Paper • 2505.18445 • Published 14 days ago • 63
Mutarjim: Advancing Bidirectional Arabic-English Translation with a Small Language Model Paper • 2505.17894 • Published 14 days ago • 214