PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models Paper • 2506.16054 • Published Jun 19 • 60
VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments Paper • 2506.02387 • Published Jun 3 • 57
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching Paper • 2412.17153 • Published Dec 22, 2024 • 40
Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding Paper • 2307.15337 • Published Jul 28, 2023 • 38
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation Paper • 2406.02540 • Published Jun 4, 2024 • 3
MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization Paper • 2405.17873 • Published May 28, 2024 • 3
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing Paper • 2505.21600 • Published May 27 • 71
MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization Paper • 2405.17873 • Published May 28, 2024 • 3
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation Paper • 2406.02540 • Published Jun 4, 2024 • 3
E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling Paper • 2412.14170 • Published Dec 18, 2024