Notice: This resource is provided by a third-party author. Please review the code, with AI tools or manually, before use to ensure security and compatibility.
thu-ml/SpargeAttn (CUDA)

SpargeAttn

[ICML 2025] SpargeAttention: a training-free sparse attention method that accelerates inference for any model.

Score: 66.0/100
Stars: 953 · Forks: 87
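The listing above describes SpargeAttn as training-free sparse attention: attention that skips computing blocks of the score matrix whose contribution is negligible. As a rough illustration of that general idea only, here is a toy NumPy sketch; the block-mean scoring and per-row top-k block selection are assumptions made for illustration, not SpargeAttn's actual algorithm.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dense_attention(q, k, v):
    # Standard scaled dot-product attention, for comparison.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def block_sparse_attention(q, k, v, block=16, keep_ratio=0.5):
    """Toy block-sparse attention (hypothetical, not SpargeAttn's method):
    estimate each (query-block, key-block) pair's importance from block-mean
    vectors, then mask out low-importance blocks before the softmax."""
    n, d = q.shape
    nb = n // block                                    # number of blocks
    qm = q.reshape(nb, block, d).mean(axis=1)          # block-mean queries
    km = k.reshape(nb, block, d).mean(axis=1)          # block-mean keys
    est = qm @ km.T                                    # coarse importance map
    # Keep the top fraction of key blocks for each query block.
    kkeep = max(1, int(np.ceil(keep_ratio * nb)))
    thresh = np.sort(est, axis=1)[:, -kkeep][:, None]
    block_mask = est >= thresh                         # (nb, nb) boolean
    # Expand the block mask to the full (n, n) score matrix.
    mask = np.kron(block_mask, np.ones((block, block), dtype=bool))
    scores = q @ k.T / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)           # drop masked blocks
    return softmax(scores) @ v
```

With `keep_ratio=1.0` every block is kept and the result matches dense attention exactly; lower ratios trade some accuracy for a sparser score matrix, which is where a real kernel would save compute.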

Similar Projects

how-to-optim-algorithm-in-cuda

Score: 60

How to optimize some algorithms in CUDA.

CUDA · 2.8K stars

rtp-llm

Score: 80

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

CUDA · 1.1K stars

LlamaFactory

Score: 92

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python · 68.0K stars

CogVideo

Score: 62

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023).

Python · 12.5K stars