Fresh Benchmarks, Reliable Scores: Introducing Continuous Prompt Stewardship for AI Risk Evaluation - MLCommons
MLCommons introduces Continuous Prompt Stewardship to keep the AILuminate AI safety benchmark fresh and reliable as frontier models evolve.
MLCommons · lori@mlcommons.org