Claude Opus 4.7, Gemini 3.1 Pro, and Others Score 0% on New SWE Benchmark - Analytics India Magazine
External review from METR of Anthropic's Sabotage Risk Report for Claude Opus 4.6
How Anthropic’s Claude Opus 4.6 Broke Its Own AI Benchmark - WinBuzzer