⚠ Notice: This resource is provided by a third-party author. Please review the code manually or with AI tools before use to ensure security and compatibility.

Python tonbistudio/turboquant-pytorch

turboquant-pytorch

From-scratch PyTorch implementation of Google's TurboQuant (ICLR 2026) for LLM KV cache compression. 5x compression at 3-bit with 99.5% attention fidelity.

52.7/100

★ 996 Forks: 134

