Augusteinia's Collections

VLM

updated Jun 26

  • BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

    Paper • 2505.09568 • Published May 14 • 97

  • Qwen3 Technical Report

    Paper • 2505.09388 • Published May 14 • 276

  • GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning

    Paper • 2505.11049 • Published May 16 • 61

  • Emerging Properties in Unified Multimodal Pretraining

    Paper • 2505.14683 • Published May 20 • 133

  • MMaDA: Multimodal Large Diffusion Language Models

    Paper • 2505.15809 • Published May 21 • 95

  • One RL to See Them All: Visual Triple Unified Reinforcement Learning

    Paper • 2505.18129 • Published May 23 • 60

  • Video World Models with Long-term Spatial Memory

    Paper • 2506.05284 • Published Jun 5 • 53

  • SpatialLM: Training Large Language Models for Structured Indoor Modeling

    Paper • 2506.07491 • Published Jun 9 • 49

  • Sekai: A Video Dataset towards World Exploration

    Paper • 2506.15675 • Published Jun 18 • 64