METR Blog page 7

METR Blog December 04, 2023 08:00

ARC Evals is now METR

ARC Evals is wrapping up our incubation period at ARC and spinning off into our own standalone nonprofit.

METR Blog September 26, 2023 10:00

Responsible Scaling Policies (RSPs)

We describe the basic components of Responsible Scaling Policies (RSPs) as well as why we find them promising for reducing catastrophic risks from AI.

METR Blog September 19, 2023 15:22

ARC Evals is spinning out from ARC

ARC Evals plans to spin out from the Alignment Research Center (ARC) in the coming months and become its own standalone organization.

METR Blog July 31, 2023 20:00

New report: Evaluating Language-Model Agents on Realistic Autonomous Tasks

We have just released our first public report. It introduces a methodology for assessing the capacity of LLM agents to acquire resources, create copies of themselves, and adapt to novel challenges they encounter in the wild.

METR Blog June 11, 2023 12:00

Input to NTIA’s AI Accountability Policy Request for Comment.

METR Blog March 17, 2023 15:22

Update on ARC's recent eval efforts

More information about ARC's evaluations of GPT-4 and Claude.
