VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models Paper • 2505.23656 • Published 8 days ago • 23
CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs Paper • 2505.24120 • Published 8 days ago • 47
DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers Paper • 2505.21541 • Published 13 days ago • 7
From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval Paper • 2505.23059 • Published 9 days ago • 13
Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers Paper • 2506.03065 • Published 3 days ago • 27
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation Paper • 2506.02397 • Published 4 days ago • 33
view article Article KV Cache from scratch in nanoVLM By ariG23498 and 4 others • 3 days ago • 54
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models Paper • 2506.01413 • Published 4 days ago • 14
LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks Paper • 2506.00411 • Published 7 days ago • 27
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces Paper • 2506.00123 • Published 7 days ago • 31
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis Paper • 2506.02096 • Published 4 days ago • 49
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models Paper • 2505.21523 • Published 15 days ago • 14
Knowledge Navigator: LLM-guided Browsing Framework for Exploratory Search in Scientific Literature Paper • 2408.15836 • Published Aug 28, 2024 • 14
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning Paper • 2504.19162 • Published Apr 27 • 16
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference Paper • 2505.02922 • Published May 5 • 27
SIFT: Grounding LLM Reasoning in Contexts via Stickers Paper • 2502.14922 • Published Feb 19 • 32
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism Paper • 2401.02954 • Published Jan 5, 2024 • 49
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization Paper • 2411.02355 • Published Nov 4, 2024 • 52