OpenAI Evaluation

Measuring AI’s capability to accelerate biological research in the wet lab

OpenAI introduces a real-world evaluation framework to measure how AI can accelerate biological research in the wet lab. Using GPT-5 to optimize a molecular cloning protocol, the work explores both the promise and risks of AI-assisted experimentation.

OpenAI Evaluation

Advancing science and math with GPT-5.2

GPT-5.2 is OpenAI’s strongest model yet for math and science, setting new state-of-the-art results on benchmarks like GPQA Diamond and FrontierMath. This post shows how those gains translate into real research progress, including solving an open theoretical...

Google DeepMind Evaluation

Deepening AI Safety Research with UK AI Security Institute (AISI)

Google DeepMind and the UK AI Security Institute (AISI) strengthen collaboration through a new research partnership, focusing on critical safety research areas like monitoring AI reasoning and evalua…
