Gemini 2.5 Flash-Lite

Google · Jul 2025 · proprietary

Gemini 2.5 Flash-Lite is the cheapest Gemini 2.5 variant, built for translation, classification, and other high-frequency, cost-sensitive jobs. It keeps multimodal input and the 1M-token context window at $0.10 per million input tokens and $0.40 per million output tokens, so it competes head-on with GPT-5 nano at the bottom of the price curve. It remains attractive for purely budget-driven workloads even after the Gemini 3.1 Flash-Lite release.

Benchmark results

No verified benchmark results are tracked yet for Gemini 2.5 Flash-Lite. This page is updated as official evaluations are published.

Where it shines

  • Among the lowest prices of any major-lab model
  • Multimodal + 1M context at the price floor
  • High rate limits for bulk processing

Alternatives to Gemini 2.5 Flash-Lite

Frequently asked questions

How much does the Gemini 2.5 Flash-Lite API cost?
Gemini 2.5 Flash-Lite costs $0.10 per million input tokens and $0.40 per million output tokens, with cached input at $0.03 per million. A workload of 10M input and 1.5M output tokens per month costs about $1.60 ($1.00 for input plus $0.60 for output).
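To adapt that estimate to a different workload, the arithmetic can be sketched in a few lines of Python. This is a rough illustration, not an official billing calculator: the function name `monthly_cost` and its parameters are hypothetical, the prices are the per-million-token figures quoted above, and it assumes cached tokens are simply billed at the cached rate with no separate cache storage fee.

```python
# Prices in USD per million tokens, as quoted in the answer above.
INPUT_PRICE = 0.10
OUTPUT_PRICE = 0.40
CACHED_INPUT_PRICE = 0.03


def monthly_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the monthly bill in USD (hypothetical helper).

    cached_input_tokens is the part of input_tokens served from cache.
    Assumption: cached tokens are billed at the cached rate only, and
    any cache storage fee is ignored.
    """
    fresh_input = input_tokens - cached_input_tokens
    if fresh_input < 0:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    total = (
        fresh_input * INPUT_PRICE
        + cached_input_tokens * CACHED_INPUT_PRICE
        + output_tokens * OUTPUT_PRICE
    ) / 1_000_000
    return round(total, 2)


# The workload from the answer above: 10M input and 1.5M output tokens.
print(monthly_cost(10_000_000, 1_500_000))  # 1.6
```

Passing a non-zero `cached_input_tokens` shows how much of the input bill caching would remove under the same assumptions.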
What is the context window of Gemini 2.5 Flash-Lite?
Gemini 2.5 Flash-Lite supports a context window of 1,048,576 tokens (about 1M), with up to 66K output tokens per response.
Is Gemini 2.5 Flash-Lite open source?
No. Gemini 2.5 Flash-Lite is a proprietary model available through Google's API and partner platforms.
What are the best alternatives to Gemini 2.5 Flash-Lite?
The closest alternatives by overall capability are Qwen3-Max, GPT-5.5, GPT-5, and OpenAI o3. See the comparison pages for detailed head-to-head breakdowns.