turboderp-org/exllamav3 (Python)
exllamav3
An optimized quantization and inference library for running LLMs locally on modern consumer-grade GPUs.