openai/evals (Python)
evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.