Notice: This resource is provided by a third-party author. Please review the code, manually or with AI tools, before use to ensure security and compatibility.
Python · ModelCloud/GPTQModel

GPTQModel

LLM quantization (compression) toolkit with hardware-acceleration support for NVIDIA CUDA, AMD ROCm, Intel XPU, and Intel/AMD/Apple CPUs via HF, vLLM, and SGLang.

Score: 82.9/100
Stars: 1.0K · Forks: 164

Similar Projects

auto-round

Score: 77

🎯 An accuracy-first, highly efficient quantization toolkit for LLMs, designed to minimize quality degradation across Weight-Only Quantization, MXFP4, NVFP4, GGUF, and adaptive schemes.

Python · 875

LlamaFactory

Score: 92

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python · 68.0K

peft

Score: 91

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python · 20.7K

OpenRLHF

Score: 89

An easy-to-use, scalable, high-performance agentic RL framework based on Ray (PPO, DAPO, REINFORCE++, TIS, vLLM, async RL)

Python · 9.1K