vllm-project/vllm (Python)
A high-throughput and memory-efficient inference and serving engine for LLMs