FACTS Benchmark Suite: a new way to systematically evaluate LLM factuality
The FACTS Benchmark Suite systematically evaluates the factuality of Large Language Models (LLMs) across three areas: Parametric, Search, and Multimodal reasoning.
Google DeepMind · The FACTS team