:metal: TT-NN operator library and TT-Metalium low-level kernel programming model.
Fast Multimodal LLM on Mobile Devices
Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama, but purpose-built and deeply optimized for AMD NPUs.
High-speed Large Language Model Serving for Local Deployment
Deeplake is an AI data runtime for agents. It provides serverless Postgres with a multimodal data lake, enabling scalable retrieval and training.