
Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs
5/12/2026
Complete analysis of Google's Gemini 3 Pro: an LMArena score of 1,501, a 1M-token context window, PhD-level reasoning, and the ecosystem advantage competitors can't replicate.