Gemini 3.5 Flash
Google·May 2026reasoningproprietary
Gemini 3.5 Flash, launched at Google I/O 2026, collapsed much of the gap between Google's fast tier and its Pro tier — scoring 78.8% on SWE-bench Verified, within a few points of much pricier flagships. It takes text, images, audio and video natively in a 1M-token context. At $1.50/$9 it sits in an aggressive middle ground: pricier than older Flash models, but far below GPT-5.5 and Opus 4.8 for a large share of their capability.
Benchmark results
Where it shines
- Flash-tier latency with near-flagship coding quality
- Native video and audio understanding
- Deep integration with Vertex AI, AI Studio and Workspace
Alternatives to Gemini 3.5 Flash
Alibaba's trillion-parameter API flagship — frontier-adjacent quality with strong agentic tool use at mid-tier prices.
OpenAI's flagship reasoning model with a 1M-token context window, built for hard coding, science and long-horizon agentic work.
OpenAI's August 2025 unified flagship that merged the GPT and o-series reasoning lines into one model with selectable effort.
OpenAI's dedicated 2025 reasoning model that pioneered thinking-with-images and agentic tool use within chain-of-thought.
Frequently asked questions
- How much does the Gemini 3.5 Flash API cost?
- Gemini 3.5 Flash costs $1.5 per million input tokens and $9 per million output tokens, with cached input at $0.38 per million. A workload of 10M input and 1.5M output tokens per month costs about $28.50.
- What is the context window of Gemini 3.5 Flash?
- Gemini 3.5 Flash supports a context window of 1,000,000 tokens (1M), with up to 66K output tokens per response.
- Is Gemini 3.5 Flash open source?
- No — Gemini 3.5 Flash is a proprietary model available through Google's API and partner platforms.
- What are the best alternatives to Gemini 3.5 Flash?
- The closest alternatives by overall capability are Qwen3-Max, GPT-5.5, GPT-5, OpenAI o3. See the comparison pages for detailed head-to-head breakdowns.