ovr.news

Solutions that work, including long-horizon plans with outcomes

Google Releases Faster Gemini 3.5 Flash

arstechnica.com · 19 May 2026
Google Releases Faster Gemini 3.5 Flash
Photo: arstechnica.com
Read on arstechnica.com

Why this is here: Gemini 3.5 Flash outputs nearly 300 tokens per second, a rate comparable to larger models that generate output at a quarter of that speed.

Google is rolling out Gemini 3.5 Flash across its products today. Tulsee Doshi, a senior director at Google, says the new model combines high intelligence with improved efficiency. Gemini 3.5 Flash can generate roughly 300 tokens per second, matching the speed of larger models while maintaining similar performance scores.

The company believes this efficiency addresses a key limitation of generative AI—its cost. Google estimates companies using substantial AI resources could save about one billion dollars annually by switching to Gemini 3.5 Flash. Pricing reflects this, with input tokens costing $1.50 per million and output tokens at $9 per million.

The 3.1 Pro model currently costs $2 to $12 per million tokens, depending on usage. Google frames this release as an initial step, suggesting further integration and optimization are planned.

Surfaced by the Solutions lens — one of the vital signs ovr.news reads.

How we evaluated this
AI summary

read the original for the full story — Read on arstechnica.com . How we work →

Why are you reporting this article?

Why are you reporting this article?