METR Blog

METR Blog February 12, 2025 08:00

Details about METR's preliminary evaluation of DeepSeek-V3

We evaluated DeepSeek-V3 for dangerous autonomous capabilities and found no evidence of dangerous capabilities beyond those of existing models such as Claude 3.5 Sonnet and GPT-4o. We also confirmed that its performance on GPQA is not due to training data...

Claude

METR Blog February 08, 2025 16:00

Frontier AI Safety Policies

Model Evaluation & Threat Research

Safety Evals LLM Evaluation

METR Blog January 31, 2025 08:00

An update on our preliminary evaluations of Claude 3.5 Sonnet and o1

Preliminary evaluations of Claude 3.5 Sonnet (New) and o1, as well as some discussion of challenges in making capability-based safety arguments for AI models.

Claude

METR Blog January 17, 2025 08:00

AI models can be dangerous before public deployment

Why pre-deployment testing is not an adequate framework for AI risk management

Safety Evals Testing Tools

METR Blog November 22, 2024 08:00

Evaluating frontier AI R&D capabilities of language model agents against human experts

We’re releasing RE-Bench, a new benchmark for measuring the performance of humans and frontier model agents on ML research engineering tasks. We also share data from 71 human expert attempts and results for Anthropic’s Claude 3.5 Sonnet and OpenAI’s...

Benchmarks

Anthropic Claude Benchmarks OpenAI

METR Blog November 12, 2024 08:00

The Rogue Replication Threat Model

Thoughts on how AI agents might develop large and resilient rogue populations.

METR Blog October 11, 2024 18:00

Red-teaming and security suggestions regarding proposed rule by the Bureau of Industry and Security, “Establishment of Reporting Requirements for the Development of Advanced Artificial Intelligence Models and Computing Clusters.”

METR Blog October 09, 2024 07:00

New Support Through The Audacious Project

Funding for Canary will enable research and implementation at scale

METR Blog September 12, 2024 17:00

Details about METR's preliminary evaluation of OpenAI o1-preview

We measured the performance of OpenAI's o1-mini and o1-preview models on our autonomy and AI R&D task suites, and found they did not exceed the capabilities of the best existing public model we've evaluated, though we could not confidently upper-bound...

OpenAI

METR Blog September 08, 2024 18:00

Suggestions for expanded guidance on capability elicitation and robust model safeguards in the U.S. AI Safety Institute’s draft document “Managing Misuse Risk for Dual-Use Foundation Models” (NIST AI 800-1).

Safety Evals

METR Blog August 20, 2024 07:00

Vivaria

Vivaria is METR's tool for running evaluations and conducting agent elicitation research. Vivaria is a web application with which users can interact using a web UI and a command-line interface.

METR Blog August 07, 2024 17:00

Details about METR's preliminary evaluation of GPT-4o

We measured the performance of GPT-4o given a simple agent scaffolding on 77 tasks across 30 task families testing autonomous capabilities.

Testing Tools