Google's top scientist to European Commission: In less than 2 hours, our Red team 'hacked' the system yo - The Times of India
External review from METR of the "Risks from automated R&D" section in Anthropic's February 2026 Risk Report
Youth AI Safety Independent Testing Regime Launch - Hot Community Stocks newser.com
MLCommons introduces Continuous Prompt Stewardship to keep the AILuminate AI safety benchmark fresh and reliable as frontier models evolve.
Google DeepMind releases new findings and an evaluation framework to measure AI's potential for harmful manipulation in areas like finance and health, with the goal of enhancing AI safety.
MLCommons is developing the AILuminate Culturally-Specific Multimodal Benchmark to close the AI performance and representation gap across APAC cultures, languages, and real-world use cases.
External review from METR of Anthropic's Sabotage Risk Report for Claude Opus 4.6
OpenAI amends its Pentagon deal after Altman admits it looked 'opportunistic and sloppy', while Claude surges to number one on the App Store and hundreds of employees publicly back Anthropic's stance.
Defense Secretary Pete Hegseth gives Anthropic until Friday to provide military access to Claude or face being declared a supply chain risk or forced compliance under the Defense Production Act.
Luca Righetti shares takeaways on the role of randomized controlled trials in AI safety testing.
Miles Kodama and Michael Chen summarize key provisions from California's SB 53, the EU Code of Practice, and New York's RAISE Act covering frontier AI developers.
OpenAI is updating its Model Spec with new Under-18 Principles that define how ChatGPT should support teens with safe, age-appropriate guidance grounded in developmental science. The update strengthens guardrails, clarifies expected model behavior in...