METR Blog

We spent 2 hours working in the future

Thomas Kwa describes a tabletop exercise in which METR researchers simulated having access to AIs with a ~200-hour time horizon.

MLCommons Evaluation Filter

Bringing Text-to-Video to MLPerf Inference v6.0

MLCommons introduces the new Text-to-Video benchmark in MLPerf Inference v6.0, based on the Wan2.2-T2V-A14B-Diffusers model and validated using the VBench framework. Learn about the key architectural decisions, including the adoption of the SingleStream...
