Collections

Collections including paper arxiv:2408.04034

RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning

Paper • 2504.18904 • Published Apr 26 • 9
ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning

Paper • 2503.21860 • Published Mar 27 • 5
Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting

Paper • 2502.19459 • Published Feb 26 • 11
MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes

Paper • 2412.11457 • Published Dec 16, 2024 • 6

Task-oriented Sequential Grounding in 3D Scenes

Paper • 2408.04034 • Published Aug 7, 2024 • 8

GECO: Generative Image-to-3D within a SECOnd

Paper • 2405.20327 • Published May 30, 2024 • 11
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion

Paper • 2406.03184 • Published Jun 5, 2024 • 22
NPGA: Neural Parametric Gaussian Avatars

Paper • 2405.19331 • Published May 29, 2024 • 10
Unified Text-to-Image Generation and Retrieval

Paper • 2406.05814 • Published Jun 9, 2024 • 16

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 35
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 28
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 127
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17, 2024 • 23

Next-Gen Robotics

A personal collection compiling everything I think is or will be related to robotics

Achieving Human Level Competitive Robot Table Tennis

Paper • 2408.03906 • Published Aug 7, 2024 • 28
openbmb/MiniCPM-V-2_6

Image-Text-to-Text • 8B • Updated Jun 13 • 79.6k • 995
apple/OpenELM-270M-Instruct

Text Generation • 0.3B • Updated Feb 28 • 1.85k • 138
google/gemma-2-2b

Text Generation • 3B • Updated Aug 7, 2024 • 928k • 579

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

Paper • 2312.16862 • Published Dec 28, 2023 • 31
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

Paper • 2312.17172 • Published Dec 28, 2023 • 29
Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers

Paper • 2401.01974 • Published Jan 3, 2024 • 7
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Paper • 2401.01885 • Published Jan 3, 2024 • 29
