Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python · openvinotoolkit/nncf

nncf

Neural Network Compression Framework for enhanced OpenVINO™ inference

81.0/100
1.1K · Forks: 296

Similar Projects

neural-compressor

90

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

Python · 2.6K

Chinese-LLaMA-Alpaca

90

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Python · 18.9K

llm-compressor

84

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python · 3.1K

z80ai

63

Z80-μLM is a 2-bit quantized language model small enough to run on an 8-bit Z80 processor. Train conversational models in Python, export them as CP/M .COM binaries, and chat with your vintage computer.

Python · 1.1K