Bridging Supervised Learning and Reinforcement Learning in Math Reasoning • Paper • 2505.18116 • Published 14 days ago • 4
Describe Anything: Detailed Localized Image and Video Captioning • Paper • 2504.16072 • Published Apr 22 • 60
FFN Fusion: Rethinking Sequential Computation in Large Language Models • Paper • 2503.18908 • Published Mar 24 • 19
Unified Visual Relationship Detection with Vision and Language Models • Paper • 2303.08998 • Published Mar 16, 2023
The iNaturalist Species Classification and Detection Dataset • Paper • 1707.06642 • Published Jul 20, 2017
Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception • Paper • 2305.06324 • Published May 10, 2023 • 1
Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset • Paper • 2004.12276 • Published Apr 26, 2020 • 1