⚠ Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.

Python · huggingface/lighteval

lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

76.7/100

★ 2.5K · Forks: 514


Similar Projects

deepeval

The LLM Evaluation Framework

Python · ★ 17.1K

mlflow

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.

Python · ★ 27.2K

ragas

Supercharge Your LLM Application Evaluations 🚀

Python · ★ 15.0K

oumi

Easily fine-tune, evaluate and deploy Gemma 4, Qwen3.5, Qwen3.6, gpt-oss, DeepSeek-R1, or any open source LLM / VLM!

Python · ★ 9.4K
