Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python · openvinotoolkit/nncf

nncf

Neural Network Compression Framework for enhanced OpenVINO™ inference

81.1/100
1.1K · Forks: 287

Similar Projects

neural-compressor

90

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

Python · 2.6K

llm-compressor

85

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python · 2.8K

z80ai

49

Z80-μLM is a 2-bit quantized language model small enough to run on an 8-bit Z80 processor. Train conversational models in Python, export them as CP/M .COM binaries, and chat with your vintage computer.

Python · 1.0K

LightCompress

57

[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.

Python · 684