Grok 4
xAI · Jul 2025 · reasoning · proprietary
Grok 4 marked xAI's arrival as a genuine frontier lab: top-tier GPQA and AIME scores, native tool use during reasoning, and live X data integration. Trained on the Colossus supercluster, it briefly led several reasoning leaderboards in mid-2025. The 4.x line has since iterated quickly; Grok 4 remains the most thoroughly benchmarked xAI model and a common comparison anchor.
Where it shines
- Excellent competition math and hard-science reasoning
- Native tool use inside its chain of thought
- Real-time knowledge through X integration
Alternatives to Grok 4
- Qwen3-Max: Alibaba's trillion-parameter API flagship, offering frontier-adjacent quality with strong agentic tool use at mid-tier prices.
- GPT-5.5: OpenAI's flagship reasoning model with a 1M-token context window, built for hard coding, science, and long-horizon agentic work.
- Gemini 3 Pro: Google's November 2025 frontier breakout; scores of 91.9% on GPQA Diamond and 37.5% on HLE made it the reasoning leader of its generation.
- Gemini 2.5 Pro: Google's 2025 workhorse flagship, the first mainstream thinking model with a 1M context, still widely deployed.
Frequently asked questions
- How much does the Grok 4 API cost?
- Grok 4 costs $3 per million input tokens and $15 per million output tokens, with cached input at $0.75 per million. A workload of 10M input and 1.5M output tokens per month costs about $52.50 ($30.00 for input plus $22.50 for output, assuming no cached input).
- What is the context window of Grok 4?
- Grok 4 supports a context window of 256,000 tokens (256K).
- Is Grok 4 open source?
- No. Grok 4 is a proprietary model available through xAI's API and partner platforms.
- What are the best alternatives to Grok 4?
- The closest alternatives by overall capability are Qwen3-Max, GPT-5.5, Gemini 3 Pro, and Gemini 2.5 Pro. See the comparison pages for detailed head-to-head breakdowns.
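
The pricing arithmetic from the cost question above can be sketched in a few lines of Python. This is a rough estimator, not an official billing formula: the function name `monthly_cost` and its parameters are hypothetical, and it assumes cached input tokens are billed at the cached rate in place of the standard input rate. The per-million prices are the ones quoted in the FAQ.

```python
from decimal import Decimal

# USD per million tokens, as quoted in the FAQ above.
INPUT_PRICE = Decimal("3.00")
OUTPUT_PRICE = Decimal("15.00")
CACHED_INPUT_PRICE = Decimal("0.75")
MILLION = Decimal(1_000_000)


def monthly_cost(input_tokens: int, output_tokens: int,
                 cached_input_tokens: int = 0) -> Decimal:
    """Estimate monthly spend in USD (hypothetical helper).

    Assumption: cached_input_tokens is the portion of input_tokens
    billed at the cached rate instead of the standard input rate.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    fresh_input_tokens = input_tokens - cached_input_tokens
    total = (
        fresh_input_tokens * INPUT_PRICE
        + cached_input_tokens * CACHED_INPUT_PRICE
        + output_tokens * OUTPUT_PRICE
    ) / MILLION
    # Round to whole cents for display.
    return total.quantize(Decimal("0.01"))


# The FAQ workload: 10M input and 1.5M output tokens, nothing cached.
print(monthly_cost(10_000_000, 1_500_000))  # 52.50
```

Under the same assumption, serving 4M of those 10M input tokens from cache would lower the estimate to $43.50, since the cached portion costs $3.00 rather than $12.00.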