Run PyTorch LLMs locally on servers, desktop and mobile
A high-throughput and memory-efficient inference and serving engine for LLMs
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning
Machine Learning Engineering Open Book
Nano vLLM