MMLU-Pro
Knowledge: MMLU-Pro (10-choice, reasoning-heavy MMLU)
A harder successor to MMLU: 12,000 questions across 14 domains, with ten answer choices instead of four and a much larger share of reasoning-dependent problems. It serves as a broad single measure of general knowledge plus reasoning.
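To make the effect of ten answer choices concrete, here is a minimal sketch of how a multiple-choice accuracy score and its random-guess baseline could be computed. It assumes every question has exactly ten options and that scores are plain accuracy; the function names and the toy answers are hypothetical and are not taken from any official evaluation harness.

```python
from fractions import Fraction


def chance_accuracy(num_choices: int) -> Fraction:
    """Expected accuracy of uniform random guessing on one question."""
    if num_choices < 1:
        raise ValueError("num_choices must be at least 1")
    return Fraction(1, num_choices)


def accuracy(predictions: list[str], answers: list[str]) -> float:
    """Fraction of questions where the predicted letter equals the gold letter."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


# Hypothetical toy data: option letters run A-J for a ten-choice question.
gold = ["C", "J", "A", "F"]
pred = ["C", "B", "A", "F"]

print(float(chance_accuracy(4)))   # four-choice baseline (original MMLU format)
print(float(chance_accuracy(10)))  # ten-choice baseline (MMLU-Pro format)
print(accuracy(pred, gold))        # 3 of 4 toy questions answered correctly
```

Under these assumptions, moving from four to ten options lowers the guessing floor from 25% to 10%, so a given score reflects less luck than the same score on a four-choice test.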
Rank  Model               Developer         Score
1     GPT-5               OpenAI            ~87%
2     Gemini 2.5 Pro      Google            86.2%
3     Qwen3-Max           Alibaba (Qwen)    ~85.2%
4     DeepSeek V3.2       DeepSeek          85%
5     DeepSeek R1 (0528)  DeepSeek          84.8%
6     Kimi K2 Thinking    Moonshot AI       ~84.6%
7     Qwen3-235B-A22B     Alibaba (Qwen)    84.4%
8     GLM-4.6             Z.ai (Zhipu)      ~82.8%
9     MiniMax M2          MiniMax           ~82%
10    Llama 4 Maverick    Meta              80.5%
11    Llama 4 Scout       Meta              74.3%
~ marks community-reported or version-normalized figures; all others come from official model cards. Prices shown as input/output per 1M tokens. Updated 2026-06-10.
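The ~ convention can be handled mechanically when the scores are processed. The sketch below assumes each score is a string such as "~87%" or "86.2%", as in the ranking above; the `Score` type and `parse_score` helper are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Score(NamedTuple):
    value: float    # percentage points, e.g. 86.2
    official: bool  # False when the figure carries the ~ marker


def parse_score(text: str) -> Score:
    """Split a leaderboard score string into its number and its provenance flag."""
    text = text.strip()
    approximate = text.startswith("~")
    number = text.lstrip("~").rstrip("%")
    return Score(float(number), not approximate)


# A few rows copied from the ranking above.
rows = [
    ("GPT-5", "~87%"),
    ("Gemini 2.5 Pro", "86.2%"),
    ("Llama 4 Scout", "74.3%"),
]

parsed = {model: parse_score(score) for model, score in rows}

# Keep only figures that come from official model cards.
official_only = [model for model, score in parsed.items() if score.official]
print(official_only)
```

Keeping the provenance flag separate from the number lets a reader sort by score while still filtering out community-reported or version-normalized entries.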