Gemini 2.5 Flash

Google · Jun 2025 · reasoning · proprietary

Gemini 2.5 Flash was one of the most-used models of 2025 by request volume: fast, cheap, multimodal, and capable enough for the bulk of real-world tasks. It pioneered hybrid thinking budgets in the fast tier, letting developers dial reasoning up or down per request. Its natural successor, Gemini 3.5 Flash, costs five times as much per input token but sits in a different capability class.

Where it shines

  • Excellent throughput economics for chat and RAG
  • Adjustable thinking budget per request
  • Full multimodal input including video
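As a rough illustration of the per-request thinking budget, the sketch below builds a request body with a different budget for each call. It makes no network call. The field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) follow the Gemini REST request format as commonly documented, but treat them as assumptions and check the current API reference before relying on them. The helper name `build_request` is hypothetical.

```python
import json


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a generateContent-style request body (hypothetical helper).

    thinking_budget is the maximum number of tokens the model may spend
    on reasoning for this one request. Field names are assumed from the
    Gemini REST format and should be verified against the official docs.
    """
    if not isinstance(thinking_budget, int):
        raise TypeError("thinking_budget must be an integer token count")
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


# A cheap lookup gets a small budget; a harder task gets a larger one.
quick = build_request("What is the capital of France?", 0)
deep = build_request("Plan a three-step database migration.", 4096)

if __name__ == "__main__":
    print(json.dumps(deep, indent=2))
```

Keeping the budget as an argument of each request, rather than a global setting, is what lets one deployment serve both latency-sensitive chat and heavier reasoning tasks.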

Frequently asked questions

How much does the Gemini 2.5 Flash API cost?
Gemini 2.5 Flash costs $0.30 per million input tokens and $2.50 per million output tokens, with cached input at $0.07 per million. A workload of 10M input and 1.5M output tokens per month costs about $6.75: $3.00 for input plus $3.75 for output, assuming no cached input.
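The arithmetic behind that estimate can be sketched in a few lines of Python. The rates are the ones quoted above. The names `Pricing` and `monthly_cost` are illustrative, and the treatment of caching is an assumption: cached tokens are counted as a subset of input tokens and billed at the cached rate instead of the standard input rate, which may not match the provider's exact billing rules.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Pricing:
    """USD per million tokens."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: float


# Rates quoted in this FAQ for Gemini 2.5 Flash.
GEMINI_25_FLASH = Pricing(input_per_m=0.30, output_per_m=2.50, cached_input_per_m=0.07)


def monthly_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
                 cached_input_tokens: int = 0) -> float:
    """Estimate monthly spend in USD, rounded to cents.

    Assumption: cached_input_tokens is a subset of input_tokens and is
    billed at the cached rate instead of the standard input rate.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_input = input_tokens - cached_input_tokens
    total = (
        fresh_input / TOKENS_PER_MILLION * pricing.input_per_m
        + cached_input_tokens / TOKENS_PER_MILLION * pricing.cached_input_per_m
        + output_tokens / TOKENS_PER_MILLION * pricing.output_per_m
    )
    return round(total, 2)


# The FAQ workload: 10M input and 1.5M output tokens, nothing cached.
baseline = monthly_cost(GEMINI_25_FLASH, 10_000_000, 1_500_000)

# The same workload if 4M of the input tokens were served from cache.
with_cache = monthly_cost(GEMINI_25_FLASH, 10_000_000, 1_500_000,
                          cached_input_tokens=4_000_000)
```

Here `baseline` reproduces the $6.75 figure. Under the caching assumption stated above, `with_cache` comes to about $5.83.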
What is the context window of Gemini 2.5 Flash?
Gemini 2.5 Flash supports a context window of 1,048,576 tokens (1.0M), with up to 66K output tokens per response.
Is Gemini 2.5 Flash open source?
No. Gemini 2.5 Flash is a proprietary model available through Google's API and partner platforms.
What are the best alternatives to Gemini 2.5 Flash?
The closest alternatives by overall capability are Qwen3-Max, GPT-5.5, GPT-5, and OpenAI o3. See the comparison pages for detailed head-to-head breakdowns.