Tiiny-AI/PowerInfer (C++)

PowerInfer

High-speed Large Language Model Serving for Local Deployment

58.3/100

★ 9.7K, Forks: 590

Similar Projects

lemonade

Lemonade helps users discover and run local AI apps by serving optimized LLMs directly from their own GPUs and NPUs. Join our Discord: https://discord.gg/5xXzkMu8Zk

C++, ★ 5.1K

ZhiLight

A highly optimized LLM inference acceleration engine for Llama and its variants.

C++, ★ 905

LLM-Hub

Local AI Assistant on your phone

C++, ★ 512

deeplake

Deeplake is an AI data runtime for agents. It provides serverless Postgres with a multimodal data lake, enabling scalable retrieval and training.

C++, ★ 9.2K