DINGO: Constrained Inference for Diffusion LLMs
Abstract
DINGO, a dynamic programming-based decoding strategy, enhances diffusion language models by enforcing structured output constraints, significantly improving performance on symbolic math and JSON generation tasks.
Diffusion LLMs have emerged as a promising alternative to conventional autoregressive LLMs, offering significant potential for improved runtime efficiency. However, existing diffusion models lack the ability to provably enforce user-specified formal constraints, such as regular expressions, which makes them unreliable for tasks that require structured outputs, such as fixed-schema JSON generation. Unlike autoregressive models, which generate tokens sequentially, diffusion LLMs predict a block of tokens in parallel. This parallelism makes traditional constrained decoding algorithms, which are designed for sequential token prediction, ineffective at preserving the true output distribution. To address this limitation, we propose DINGO, a dynamic programming-based constrained decoding strategy that is both efficient and provably distribution-preserving. DINGO samples the highest-probability output string under the model's predicted distribution while strictly satisfying any user-specified regular expression. On standard symbolic math and JSON generation benchmarks, DINGO achieves up to a 68 percentage point improvement over unconstrained inference.
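To make the dynamic-programming idea in the abstract concrete: if we assume, as a simplification, that the diffusion model emits an independent distribution over the vocabulary at each position of the block, then the highest-probability length-L string accepted by a regular expression can be found with a Viterbi-style dynamic program over (position, DFA state) pairs. The sketch below illustrates that idea only; it is not the paper's implementation, and the vocabulary, regex, DFA, and probabilities are all hypothetical toy values.

```python
import math

# Hypothetical toy setup: a 3-token vocabulary and a hand-built DFA for the
# regex (ab)+ : state 0 = start, state 1 = just saw 'a', state 2 = saw 'ab'
# (accepting). Missing (state, token) pairs mean a dead state.
VOCAB = ["a", "b", "c"]
DELTA = {(0, "a"): 1, (1, "b"): 2, (2, "a"): 1}
ACCEPTING = {2}

def best_constrained_string(logprobs):
    """logprobs[i][t] = model log-probability of token t at position i.
    Returns the highest-probability length-L string the DFA accepts."""
    L = len(logprobs)
    best = {0: 0.0}   # DFA state -> log-prob of the best prefix reaching it
    back = []         # per position: state -> (previous state, chosen token)
    for i in range(L):
        nxt, ptr = {}, {}
        for s, score in best.items():
            for t in VOCAB:
                s2 = DELTA.get((s, t))
                if s2 is None:          # transition into the dead state
                    continue
                cand = score + logprobs[i][t]
                if cand > nxt.get(s2, -math.inf):
                    nxt[s2] = cand
                    ptr[s2] = (s, t)
        back.append(ptr)
        best = nxt
    final = {s: v for s, v in best.items() if s in ACCEPTING}
    if not final:
        return None                     # no accepted string of length L
    s = max(final, key=final.get)
    out = []
    for ptr in reversed(back):          # follow backpointers to recover tokens
        s, t = ptr[s]
        out.append(t)
    return "".join(reversed(out))

# Toy per-position distributions (log-probs) for a block of length 4.
P = [{"a": math.log(0.5), "b": math.log(0.3), "c": math.log(0.2)}] * 4
print(best_constrained_string(P))  # -> "abab", the best string matching (ab)+
```

The table `best` keeps one entry per DFA state, so the DP runs in O(L · |Q| · |V|) time for a block of length L, DFA states Q, and vocabulary V. DINGO's actual algorithm must additionally handle multi-character tokens and the iterative remasking of diffusion decoding, which this toy deliberately omits.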
Community
Code to be released very soon!
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding (2025)
- Earley-Driven Dynamic Pruning for Efficient Structured Decoding (2025)
- Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding (2025)
- Reviving Any-Subset Autoregressive Models with Principled Parallel Sampling and Speculative Decoding (2025)
- Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models (2025)
- d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning (2025)
- dKV-Cache: The Cache for Diffusion Language Models (2025)