Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play • Paper • 2505.02707 • Published May 5 • 86
MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing • Paper • 2505.02823 • Published May 5 • 5
PixelHacker: Image Inpainting with Structural and Semantic Consistency • Paper • 2504.20438 • Published Apr 29 • 44
Improving Editability in Image Generation with Layer-wise Memory • Paper • 2505.01079 • Published May 2 • 29
KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution • Paper • 2505.00497 • Published May 1 • 17
Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions • Paper • 2504.19056 • Published Apr 27 • 18
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction • Paper • 2504.21855 • Published Apr 30 • 13