Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
C++ · vectorch-ai/ScaleLLM

ScaleLLM

A high-performance inference system for large language models, designed for production environments.

61.8/100
500 · Forks: 40

Similar Projects

ZhiLight

63

A highly optimized LLM inference acceleration engine for Llama and its variants.

C++ · 905

whisper.cpp

86

Port of OpenAI's Whisper model in C/C++

C++ · 49.3K

runanywhere-sdks

81

Production ready toolkit to run AI locally

C++ · 10.4K

PowerInfer

56

High-speed Large Language Model Serving for Local Deployment

C++ · 9.4K