evald.ai

evald.ai METR Blog

Portable Evaluation Tasks via the METR Task Standard

February 29, 2024 08:00

METR has published a standard way to define tasks for evaluating the capabilities of AI agents.
