Python · vllm-project/llm-compressor

llm-compressor

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

85.3/100
2.8K · Forks: 423

Similar Projects

nncf

81

Neural Network Compression Framework for enhanced OpenVINO™ inference

Python · 1.1K

neural-compressor

90

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

Python · 2.6K

LlamaFactory

92

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python · 68.0K

faster-whisper

66

Faster Whisper transcription with CTranslate2

Python · 21.3K