Move over, autoregressive models: there's a new sheriff in town! Meet Diffusion Large Language Models (dLLMs), a new approach to AI that flips the script on how text generation works.
What is a dLLM?
If you're familiar with AI-generated images, you've probably heard of diffusion models. They start with noise and gradually refine it into a coherent image.
Well, someone had a bright idea: what if we did the same for text? That's exactly what dLLMs do!
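Unlike traditional LLMs (like GPT-4 or Llama 3.2) that predict one token at a time, dLLMs draft an entire text sequence at once and refine it over multiple steps. Because many tokens are updated in parallel at each step, generation can be much faster, and the repeated refinement passes give the model a chance to produce more structured, “smarter” text.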
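To make the difference concrete, here's a toy sketch in Python. It is not how Mercury works internally (Inception Labs hasn't published that here), and `toy_model` is a made-up stand-in for a trained network: it simply looks the answer up in `target` and invents a confidence score. The sketch uses a masked-denoising style of refinement, which is just one possible flavor of text diffusion. The only thing it's meant to show is the shape of the two loops: one model call per token versus a handful of calls that each fill in several positions.

```python
import math
import random

MASK = "[MASK]"

def toy_model(sequence, target, rng):
    """Hypothetical stand-in for a trained denoiser.

    For every masked position it returns (position, guessed_token, confidence).
    A real model would predict tokens from context; this toy one just
    looks them up in `target` and makes up a confidence score.
    """
    return [(i, target[i], rng.random())
            for i, tok in enumerate(sequence) if tok == MASK]

def autoregressive_decode(target):
    """One model call per token, strictly left to right."""
    out, calls = [], 0
    for tok in target:
        out.append(tok)  # pretend the model predicted the next token
        calls += 1
    return out, calls

def diffusion_style_decode(target, steps=4, seed=0):
    """Start fully masked, then fill in several positions per step."""
    rng = random.Random(seed)
    sequence = [MASK] * len(target)
    per_step = math.ceil(len(target) / steps)
    calls = 0
    while MASK in sequence:
        guesses = toy_model(sequence, target, rng)
        calls += 1
        # Keep only the most confident guesses this round; the rest stay masked.
        guesses.sort(key=lambda g: g[2], reverse=True)
        for pos, tok, _ in guesses[:per_step]:
            sequence[pos] = tok
        print(f"step {calls}: {' '.join(sequence)}")
    return sequence, calls

target = "diffusion models refine the whole sequence in parallel".split()
ar_out, ar_calls = autoregressive_decode(target)
df_out, df_calls = diffusion_style_decode(target, steps=4)
print(f"autoregressive: {ar_calls} model calls, diffusion-style: {df_calls}")
```

Both loops end with the same sentence, but the autoregressive one needs a model call for each of the eight tokens while the diffusion-style one needs four. Note that this toy never changes a token once it's placed; a real dLLM may also revise earlier guesses during refinement.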
Mercury, the first commercial-scale dLLM
While not the first of its kind [1], Mercury is the first commercial-scale dLLM. It was recently unveiled by Inception Labs, and it's been attracting a lot of attention!
Why? Because it can generate over 1000 tokens per second on an NVIDIA H100 — blowing traditional models out of the water while keeping high quality. If you've ever waited for a slow AI response, you know why this is a big deal.

And for devs? There's Mercury Coder, a version optimized for writing code. Benchmarks suggest it's on par with GPT-4o mini and Claude 3.5 Haiku, but up to 10x faster. Imagine getting near-instant code completions without giving up quality. This is a game changer!
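For a rough feel of what those numbers mean in practice, here's a back-of-the-envelope calculation. The 1000 tokens per second figure is the one quoted above; the 300-token completion size and the baseline speed (simply one tenth of the dLLM speed, the best case implied by "up to 10x") are assumptions for illustration, not measurements.

```python
def seconds_to_generate(num_tokens, tokens_per_second):
    """Time to emit num_tokens at a steady throughput.

    Ignores network latency and prompt processing.
    """
    return num_tokens / tokens_per_second

completion_tokens = 300            # assumed size of a medium code completion
dllm_speed = 1000                  # tokens/s, the figure quoted for Mercury on an H100
baseline_speed = dllm_speed / 10   # assumption: best case implied by "up to 10x faster"

dllm_time = seconds_to_generate(completion_tokens, dllm_speed)
baseline_time = seconds_to_generate(completion_tokens, baseline_speed)
print(f"dLLM: {dllm_time:.1f}s, baseline: {baseline_time:.1f}s")
```

Under those assumptions, that's the difference between a completion that shows up in a fraction of a second and one you wait a few seconds for.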

Why Should You Care?
Beyond just speed, dLLMs offer some other perks:
- Better Reasoning. Since they refine text over multiple steps, they can catch errors and improve coherence while the response is still being generated.
- Multimodal Potential. Diffusion models already power text-to-image, video, and music AI — so imagine what a unified dLLM could do.
- More Control. Structured generation means better function calling and more reliable outputs.
The Future is Diffused
The introduction of dLLMs marks a major shift in AI development. They have the potential to revolutionize chatbots, code generation, and long-form content creation. Diffusion models took image AI to the next level, and they might do just the same for text.
Curious? Check out the Inception Labs website to learn more!
What do you think? Are dLLMs the next big thing, or just another AI experiment? Let's chat in the comments! 👇
Thanks for reading!
Cover image credit: https://inceptionlabs.ai
Article by BestCodes
[1] Other dLLMs have been produced, though not at commercial scale. See for example: https://arxiv.org/abs/2310.17680