mangoxb's Collections
One-Minute Video Generation with Test-Time Training
Paper • 2504.05298 • Published • 105
MoCha: Towards Movie-Grade Talking Character Synthesis
Paper • 2503.23307 • Published • 134
Towards Understanding Camera Motions in Any Video
Paper • 2504.15376 • Published • 157
Antidistillation Sampling
Paper • 2504.13146 • Published • 61
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization
Paper • 2503.19901 • Published • 41
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance
Paper • 2504.01724 • Published • 67
Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation
Paper • 2412.01316 • Published • 9
STIV: Scalable Text and Image Conditioned Video Generation
Paper • 2412.07730 • Published • 75
VidGen-1M: A Large-Scale Dataset for Text-to-video Generation
Paper • 2408.02629 • Published • 15
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
Paper • 2503.01739 • Published • 8
Video-T1: Test-Time Scaling for Video Generation
Paper • 2503.18942 • Published • 88
VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide
Paper • 2410.04364 • Published • 30
Improving Video Generation with Human Feedback
Paper • 2501.13918 • Published • 50
Training-free Long Video Generation with Chain of Diffusion Model Experts
Paper • 2408.13423 • Published • 24
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models
Paper • 2502.02492 • Published • 65
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
Paper • 2502.01061 • Published • 216
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
Paper • 2501.09732 • Published • 72
LTX-Video: Realtime Video Latent Diffusion
Paper • 2501.00103 • Published • 47
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
Paper • 2412.05271 • Published • 158
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
Paper • 2501.00599 • Published • 48
Shifting AI Efficiency From Model-Centric to Data-Centric Compression
Paper • 2505.19147 • Published • 143
AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection
Paper • 2505.07293 • Published • 26
Alchemist: Turning Public Text-to-Image Data into Generative Gold
Paper • 2505.19297 • Published • 73
Predictive Data Selection: The Data That Predicts Is the Data That Teaches
Paper • 2503.00808 • Published • 57
R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training
Paper • 2505.00358 • Published • 24
ICon: In-Context Contribution for Automatic Data Selection
Paper • 2505.05327 • Published • 11
SWE-smith: Scaling Data for Software Engineering Agents
Paper • 2504.21798 • Published • 10
MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning
Paper • 2505.24871 • Published • 18
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale
Paper • 2409.17115 • Published • 63