Rui Zhao

ruizhaocv

https://ruizhaocv.github.io/

AI & ML interests

Multimodal and GenAI

ruizhaocv's activity

upvoted 2 papers 8 days ago

D-AR: Diffusion via Autoregressive Models

Paper • 2505.23660 • Published 8 days ago • 34

UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning

Paper • 2505.23380 • Published 8 days ago • 23

upvoted a paper 9 days ago

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published 10 days ago • 91

upvoted a paper 10 days ago

OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

Paper • 2505.18445 • Published 14 days ago • 63

upvoted a paper 11 days ago

RePrompt: Reasoning-Augmented Reprompting for Text-to-Image Generation via Reinforcement Learning

Paper • 2505.17540 • Published 14 days ago • 7

upvoted a paper 18 days ago

Visual Planning: Let's Think Only with Images

Paper • 2505.11409 • Published 21 days ago • 55

upvoted a paper 26 days ago

Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published 29 days ago • 78

upvoted a paper about 1 month ago

BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation

Paper • 2504.14538 • Published Apr 20 • 29

upvoted 3 papers about 2 months ago

Packing Input Frame Context in Next-Frame Prediction Models for Video Generation

Paper • 2504.12626 • Published Apr 17 • 50

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published Apr 7 • 105

SkyReels-A2: Compose Anything in Video Diffusion Transformers

Paper • 2504.02436 • Published Apr 3 • 37

upvoted a paper 2 months ago

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published Mar 25 • 73

upvoted a paper 3 months ago

XAttention: Block Sparse Attention with Antidiagonal Scoring

Paper • 2503.16428 • Published Mar 20 • 14

liked a Space 3 months ago

Hunyuan3D 2mini Turbo

Space • Fast Images-to-3D Generation within 1 Second • 173

upvoted 6 papers 3 months ago

MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance

Paper • 2503.16421 • Published Mar 20 • 10

JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse

Paper • 2503.16365 • Published Mar 20 • 41