Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python · claw-eval/claw-eval

claw-eval

Claw-Eval is a harness for evaluating LLMs as agents. All tasks are verified by humans.

59.0/100
500 · Forks: 40

Similar Projects

nexent

87

Nexent is a zero-code platform for auto-generating production-grade AI agents using Harness Engineering principles — unified tools, skills, memory, and orchestration with built-in constraints, feedback loops, and control planes.

Python · 4.4K

PPTAgent

85

An Agentic Framework for Reflective PowerPoint Generation

Python · 4.2K

Clawith

83

OpenClaw for Teams

Python · 3.4K

MetaClaw

77

🦞 Just talk to your agent — it learns and EVOLVES 🧬.

Python · 3.4K