Evaluation Guidebook - a Hugging Face Space by OpenEvals
This page automatically loads score data from several LLM leaderboards and shows an interactive chart that tracks how top benchmark results have changed. The chart groups benchmarks by category, hi...
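The tracking idea in the description above can be sketched as a running best score per benchmark, grouped by category. This is only an illustration under stated assumptions, not the Space's actual code: the record layout, the `top_score_timeline` helper, the benchmark names, and the score values are all hypothetical placeholders.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (benchmark, category, submission date, score).
# Names and values are placeholders, not real leaderboard data.
RECORDS = [
    ("bench_a", "reasoning", date(2023, 1, 10), 41.0),
    ("bench_a", "reasoning", date(2023, 6, 2), 38.5),
    ("bench_a", "reasoning", date(2024, 2, 20), 57.2),
    ("bench_b", "coding", date(2023, 3, 5), 22.0),
    ("bench_b", "coding", date(2023, 9, 14), 30.5),
]


def top_score_timeline(records):
    """Return {category: {benchmark: [(date, best_score_so_far), ...]}}.

    A point is emitted only when a submission beats the previous best,
    so each list traces how the top result moved over time.
    """
    # Collect every (date, score) pair under its (category, benchmark) key.
    by_benchmark = defaultdict(list)
    for benchmark, category, day, score in records:
        by_benchmark[(category, benchmark)].append((day, score))

    timeline = defaultdict(dict)
    for (category, benchmark), rows in by_benchmark.items():
        best = float("-inf")
        points = []
        # Walk submissions in date order and keep only new records.
        for day, score in sorted(rows):
            if score > best:
                best = score
                points.append((day, best))
        timeline[category][benchmark] = points
    return dict(timeline)


if __name__ == "__main__":
    for category, benchmarks in top_score_timeline(RECORDS).items():
        for benchmark, points in benchmarks.items():
            print(category, benchmark, [(d.isoformat(), s) for d, s in points])
```

Each list of points could then be handed to any plotting library as one line per benchmark, with the category used to group or color the lines.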