google-ai-edge / LiteRT-LM
LiteRT-LM is a high-performance, production-ready inference framework from Google for deploying large language models (LLMs) on edge devices. It runs on Android, iOS, desktop, and IoT platforms and uses GPU and NPU hardware acceleration to speed up inference. The framework supports advanced capabilities such as multimodality and function calling, and it powers on-device AI experiences in various Google products.