MMLU-Pro

Knowledge

MMLU-Pro (10-choice reasoning-heavy MMLU)

A harder successor to MMLU: roughly 12,000 questions across 14 domains, with ten answer choices instead of four and a much larger share of reasoning-dependent problems. It serves as a broad single measure of general knowledge plus reasoning.
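One practical effect of moving from four to ten answer choices is a lower random-guess floor. The short sketch below works through that arithmetic. It assumes every question has exactly ten options and that guessing is uniform; `chance_accuracy` is a hypothetical helper written for illustration, not part of any benchmark tooling.

```python
def chance_accuracy(num_choices: int) -> float:
    """Expected accuracy of uniform random guessing on multiple-choice items."""
    if num_choices < 1:
        raise ValueError("num_choices must be at least 1")
    return 1.0 / num_choices


# Assumption: four options per question for MMLU, ten for MMLU-Pro.
mmlu_floor = chance_accuracy(4)
mmlu_pro_floor = chance_accuracy(10)

print(f"MMLU random-guess floor:     {mmlu_floor:.0%}")      # 25%
print(f"MMLU-Pro random-guess floor: {mmlu_pro_floor:.0%}")  # 10%
```

Under these assumptions, a score on the ten-choice format carries less credit from lucky guessing than the same score on the four-choice format would.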

Rank  Model               Provider        Price (in/out)  Score
1     GPT-5               OpenAI          $1.25/$10       ~87%
2     Gemini 2.5 Pro      Google          $1.25/$10       86.2%
3     Qwen3-Max           Alibaba (Qwen)  $1.2/$6         ~85.2%
4     DeepSeek V3.2       DeepSeek        $0.28/$0.42     85%
5     DeepSeek R1 (0528)  DeepSeek        $0.55/$2.19     84.8%
6     Kimi K2 Thinking    Moonshot AI     $0.6/$2.5       ~84.6%
7     Qwen3-235B-A22B     Alibaba (Qwen)  $0.22/$0.88     84.4%
8     GLM-4.6             Z.ai (Zhipu)    $0.6/$2.2       ~82.8%
9     MiniMax M2          MiniMax         $0.3/$1.2       ~82%
10    Llama 4 Maverick    Meta            $0.27/$0.85     80.5%
11    Llama 4 Scout       Meta            $0.18/$0.59     74.3%

~ marks community-reported or version-normalized figures; all other scores come from official model cards. Prices are shown as input/output in USD per 1M tokens. Updated 2026-06-10.
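To make the input/output pricing concrete, here is a minimal sketch of how a per-1M-token price pair turns into a cost for a given workload. The prices are taken from the listing above; the token counts are made-up illustrative values, and `workload_cost` is a hypothetical helper, not any provider's billing formula.

```python
TOKENS_PER_PRICE_UNIT = 1_000_000  # prices are quoted per 1M tokens


def workload_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Cost of a workload given separate per-1M-token input and output prices."""
    input_cost = input_tokens / TOKENS_PER_PRICE_UNIT * input_price
    output_cost = output_tokens / TOKENS_PER_PRICE_UNIT * output_price
    return input_cost + output_cost


# Assumed workload (illustrative only): 2M input tokens, 500K output tokens.
input_tokens, output_tokens = 2_000_000, 500_000

# (input price, output price) pairs copied from the listing above.
prices = {
    "GPT-5": (1.25, 10.0),
    "DeepSeek V3.2": (0.28, 0.42),
}

for model, (in_price, out_price) in prices.items():
    cost = workload_cost(input_tokens, output_tokens, in_price, out_price)
    print(f"{model}: ${cost:.2f}")
```

Because output tokens are priced separately and usually higher, the output-heavy share of a workload tends to dominate the total for models with a large input/output price gap.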