Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python · THUDM/AgentBench

AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

63.4/100
3.2K · Forks: 240

Similar Projects

langroid

82

Harness LLMs with Multi-Agent Programming

Python · 3.9K

gptme

90

Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web. Make your own persistent autonomous agent on top!

Python · 4.2K

AutoPR

82

AutoPR autonomously wrote pull requests in response to issues

Python · 1.4K

vim-ai

71

AI-powered code assistant for Vim. OpenAI and ChatGPT plugin for Vim and Neovim.

Python · 1.1K