Normalized Attention Guidance: Universal Negative Guidance for Diffusion Model Paper • 2505.21179 • Published 10 days ago • 8
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors Paper • 2505.24625 • Published 7 days ago • 8
OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning Paper • 2506.00338 • Published 7 days ago • 8
Cora: Correspondence-aware image editing using few step diffusion Paper • 2505.23907 • Published 8 days ago • 11
ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding Paper • 2506.01853 • Published 4 days ago • 27
Temporal In-Context Fine-Tuning for Versatile Control of Video Diffusion Models Paper • 2506.00996 • Published 5 days ago • 34
ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL Paper • 2505.24875 • Published 7 days ago • 10
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering Paper • 2505.24417 • Published 7 days ago • 12
UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation Paper • 2505.24521 • Published 7 days ago • 15
EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge Paper • 2505.23009 • Published 9 days ago • 17
CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects Paper • 2505.21437 • Published 10 days ago • 20
ViStoryBench: Comprehensive Benchmark Suite for Story Visualization Paper • 2505.24862 • Published 7 days ago • 30
SridBench: Benchmark of Scientific Research Illustration Drawing of Image Generation Model Paper • 2505.22126 • Published 9 days ago • 4
ATI: Any Trajectory Instruction for Controllable Video Generation Paper • 2505.22944 • Published 9 days ago • 7
Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape Paper • 2505.22918 • Published 9 days ago • 7
Differentiable Solver Search for Fast Diffusion Sampling Paper • 2505.21114 • Published 10 days ago • 10
MAGREF: Masked Guidance for Any-Reference Video Generation Paper • 2505.23742 • Published 8 days ago • 9