Submitted by Benjamin-eecs 48 SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning · 12 authors 136 3
Submitted by jianzongwu 31 VMoBA: Mixture-of-Block Attention for Video Diffusion Models · 8 authors 38 1
Submitted by a-yakovenko 24 MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation · 4 authors 57 2
Submitted by alexgambashidze 23 Listener-Rewarded Thinking in VLMs for Image Preferences · 8 authors 1
Submitted by Jianyu 19 Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective · 3 authors 2
Submitted by Skhaki 17 SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity · 10 authors 51 2
Submitted by wanhaoliu 14 Consistent Time-of-Flight Depth Denoising via Graph-Informed Geometric Attention · 4 authors 2
Submitted by mdmoor 12 MARBLE: A Hard Benchmark for Multimodal Spatial Reasoning and Planning · 4 authors 6 4
Submitted by najoungkim 11 RExBench: Can coding agents autonomously implement AI research extensions? · 7 authors 4 1
Submitted by Mingyuan1997 11 Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling? · 8 authors 1
Submitted by liuhuadai 8 ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing · 7 authors 971 2
Submitted by JJ-TMT 7 UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding · 5 authors 39 1
Submitted by jmprcp 4 Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs · 7 authors 2
Submitted by RaghavvGoel 3 VOCABTRIM: Vocabulary Pruning for Efficient Speculative Decoding in LLMs · 12 authors 1
Submitted by XiaoyunYuan 2 Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography · 5 authors 5 1