Collections

Collections including paper arxiv:2404.10667

The latest AI-powered technologies usher in a new era of realistic avatars! 🚀

EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Paper • 2402.17485 • Published Feb 27, 2024 • 196
VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior

Paper • 2312.01841 • Published Dec 4, 2023 • 1
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

Paper • 2311.16498 • Published Nov 27, 2023 • 1
GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians

Paper • 2312.02134 • Published Dec 4, 2023 • 2

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16, 2024 • 20

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16, 2024 • 20
CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention

Paper • 2409.01876 • Published Sep 3, 2024 • 2
DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Paper • 2312.13578 • Published Dec 21, 2023 • 29
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians

Paper • 2312.03029 • Published Dec 5, 2023 • 26

image to text / captioner

sdasd112132/Vision-8B-MiniCPM-2_5-Uncensored-and-Detailed-4bit

Visual Question Answering • 5B • Updated Jun 1, 2024 • 15 • 32
Idefics3 📊

Running • 102
Generate text based on an image and prompt
Vilt Vqa 🌍

Runtime error • 37
Ask questions about images and get answers
vikhyatk/moondream2

Image-Text-to-Text • 2B • Updated Jul 7 • 141k • 1.25k

facebook/detr-resnet-50

Object Detection • 0.0B • Updated Apr 10, 2024 • 320k • 886
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16, 2024 • 20
TencentARC/PhotoMaker

Text-to-Image • Updated Jul 22, 2024 • 10.7k • 433

conversational video

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16, 2024 • 20
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Paper • 2401.01885 • Published Jan 3, 2024 • 29

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16, 2024 • 20
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Paper • 2409.02634 • Published Sep 4, 2024 • 98

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16, 2024 • 20

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11, 2024 • 94
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16, 2024 • 20
Instruction-tuned Language Models are Better Knowledge Learners

Paper • 2402.12847 • Published Feb 20, 2024 • 27
DoRA: Weight-Decomposed Low-Rank Adaptation

Paper • 2402.09353 • Published Feb 14, 2024 • 28

Visual_Question_answering

keyurhirpara/idefics_9b_vqa_model

Updated Mar 9, 2024
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16, 2024 • 20
