METR Blog

Autonomy Evaluation Resources

A collection of resources for evaluating potentially dangerous autonomous capabilities of frontier models.
