Gemini 3.1 Flash-Lite

Google · Mar 2026 · proprietary

Gemini 3.1 Flash-Lite is Google's high-volume tier, designed for workloads where unit economics dominate: bulk summarization, media tagging, classification and high-QPS chat. Unusually for its price class, it keeps full multimodal input (including video) and the 1M-token context window, which makes it a strong pick for cheap first-pass analysis of large media libraries.

Benchmark results

No verified benchmark results are tracked yet for Gemini 3.1 Flash-Lite. This page is updated as official evaluations are published.

Where it shines

  • Multimodal input at commodity pricing
  • 1M context in the budget tier
  • Very high rate limits for production scale

Frequently asked questions

How much does the Gemini 3.1 Flash-Lite API cost?
Gemini 3.1 Flash-Lite costs $0.25 per million input tokens and $1.50 per million output tokens. A workload of 10M input and 1.5M output tokens per month costs about $4.75 ($2.50 for input plus $2.25 for output).
What is the context window of Gemini 3.1 Flash-Lite?
Gemini 3.1 Flash-Lite supports a context window of 1,000,000 tokens (1M), with up to 66K output tokens per response.
Is Gemini 3.1 Flash-Lite open source?
No — Gemini 3.1 Flash-Lite is a proprietary model available through Google's API and partner platforms.
What are the best alternatives to Gemini 3.1 Flash-Lite?
The closest alternatives by overall capability are Qwen3-Max, GPT-5.5, GPT-5, and OpenAI o3. See the comparison pages for detailed head-to-head breakdowns.
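
The monthly cost figure in the pricing answer can be reproduced with a short calculation. The sketch below is illustrative only: the function name `estimate_cost` and the constants are hypothetical, and it assumes flat per-token pricing at the rates quoted on this page, with no tiering, caching discounts, or separate rates for non-text input.

```python
from decimal import Decimal

# Rates quoted on this page, in USD per million tokens.
# Assumption: flat pricing, no tiers or discounts.
INPUT_PRICE_PER_M = Decimal("0.25")
OUTPUT_PRICE_PER_M = Decimal("1.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost for the given token counts."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # The example workload from the FAQ: 10M input and 1.5M output tokens.
    monthly = estimate_cost(10_000_000, 1_500_000)
    print(f"Estimated monthly cost: ${monthly:.2f}")
```

For the FAQ workload this gives $2.50 for input plus $2.25 for output, matching the $4.75 figure. `Decimal` is used instead of floats so that currency amounts stay exact.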