Details about METR's evaluation of OpenAI GPT-5.1-Codex-Max
We evaluate whether GPT-5.1-Codex-Max poses significant catastrophic risks via AI self-improvement, rogue replication, or sabotage of AI labs. We conclude that such risks seem unlikely.