OpenAI Evaluation Filter

Evaluating chain-of-thought monitorability

OpenAI introduces a new framework and evaluation suite for chain-of-thought monitorability, covering 13 evaluations across 24 environments. Our findings show that monitoring a model’s internal reasoning is far more effective than monitoring outputs alone,...

Updating our Model Spec with teen protections

OpenAI is updating its Model Spec with new Under-18 Principles that define how ChatGPT should support teens with safe, age-appropriate guidance grounded in developmental science. The update strengthens guardrails, clarifies expected model behavior in...
