microsoft/sarathi-serve (Python)
sarathi-serve
A low-latency & high-throughput serving engine for LLMs