Serverless LLM Serving for Everyone
A high-throughput and memory-efficient inference and serving engine for LLMs
Machine Learning Engineering Open Book
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that orchestrate inference execution efficiently.
Supercharge Your LLM with the Fastest KV Cache Layer