Google Releases Faster Gemini 3.5 Flash

arstechnica.com · 19 May 2026

Photo: arstechnica.com

Why this is here: Gemini 3.5 Flash outputs nearly 300 tokens per second, a rate comparable to larger models that generate output at a quarter of that speed.

Google is rolling out Gemini 3.5 Flash across its products today. Tulsee Doshi, a senior director at Google, says the new model combines high intelligence with improved efficiency. Gemini 3.5 Flash can generate roughly 300 tokens per second, matching the speed of larger models while maintaining similar performance scores.

The company believes this efficiency addresses a key limitation of generative AI—its cost. Google estimates companies using substantial AI resources could save about one billion dollars annually by switching to Gemini 3.5 Flash. Pricing reflects this, with input tokens costing $1.50 per million and output tokens at $9 per million.

The 3.1 Pro model currently costs $2 to $12 per million tokens, depending on usage. Google frames this release as an initial step, suggesting further integration and optimization are planned.