DiffusionGemma: Google's New Diffusion-Based Approach to Text Generation

simonwillison.net · 2026-06-10 · Excerpt

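Last year Google briefly released an experimental Gemini Diffusion model, which ran at 857 tokens/second in preview testing, and then said nothing more about it. That research has now returned under the name DiffusionGemma, which applies a diffusion architecture to text generation. It signals Google's continued investment in non-autoregressive text generation, and DiffusionGemma may offer a new route to high-speed text inference.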

Simon Willison

10th June 2026 - Link Blog

DiffusionGemma. Last May Google briefly released an experimental Gemini Diffusion model. I tried the preview at the time and recorded it running at 857 tokens/second. It was an exciting model, but Google made no further announcements about it.

That research has returned in the best possible way: as a new open weight (Apache 2 licensed) Gemma model, google/diffusiongemma-26B-A4B-it.

NVIDIA are currently hosting the model for free on their NIM cloud API. I used that API to generate this pelican, which took 4.4s (according to time uv run generate.py) to return 2,409 tokens - so at least 500 tokens/second.
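The "at least 500 tokens/second" figure follows from dividing the token count by the elapsed time. A minimal sketch of that arithmetic, using only the two numbers quoted above; the function name tokens_per_second is illustrative and is not taken from generate.py, whose contents are not shown in the post.

```python
def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Average throughput over a whole request, in tokens per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return token_count / elapsed_seconds


# Figures quoted in the post: 2,409 tokens returned in 4.4s of wall-clock time.
rate = tokens_per_second(2409, 4.4)
print(f"{rate:.1f} tokens/second")  # roughly 547.5
```

Since the 4.4s was measured with time around the whole uv run invocation, it presumably also covers process startup and the network round trip, so this average is best read as a lower bound on the model's generation speed, which fits the post's "at least".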

Posted 10th June 2026 at 8 pm

This is a link post by Simon Willison, posted on 10th June 2026.
