LiteRT-LM is Google's production-ready, high-performance, open-source inference framework for deploying Large Language Models on edge devices.
Deploy Gemma 4 across a broad range of hardware with stellar performance (blog).
👉 Try on Linux, macOS, Windows (WSL) or Raspberry Pi with the LiteRT-LM CLI:
```shell
litert-lm run \
  --from-huggingface-repo=litert-community/gemma-4-E2B-it-litert-lm \
  gemma-4-E2B-it.litertlm \
  --prompt="What is the capital of France?"
```

- 📱 Cross-Platform Support: Android, iOS, Web, Desktop, and IoT (e.g., Raspberry Pi).
- 🚀 Hardware Acceleration: Peak performance via GPU and NPU accelerators.
- 👁️ Multi-Modality: Support for vision and audio inputs.
- 🔧 Tool Use: Function calling support for agentic workflows.
- 📚 Broad Model Support: Gemma, Llama, Phi-4, Qwen, and more.
LiteRT-LM powers on-device GenAI experiences in Chrome, Chromebook Plus, Pixel Watch, and more.
You can also try the Google AI Edge Gallery app to run models immediately on your device.
| Install the app today from Google Play | Install the app today from App Store |
|---|---|
| Link | Description |
|---|---|
| Bring state-of-the-art agentic skills to the edge with Gemma 4 | Deploy Gemma 4 in-app and across a broader range of devices with stellar performance and broad reach using LiteRT-LM. |
| On-device GenAI in Chrome, Chromebook Plus and Pixel Watch | Deploy language models on wearables and browser-based platforms using LiteRT-LM at scale. |
| On-device Function Calling in Google AI Edge Gallery | Explore how to fine-tune FunctionGemma and enable function calling capabilities powered by LiteRT-LM Tool Use APIs. |
| Google AI Edge small language models, multimodality, and function calling | Latest insights on RAG, multimodality, and function calling for edge language models. |
- 👉 Technical Overview including performance benchmarks, model support, and more.
- 👉 LiteRT-LM CLI Guide including installation, getting started, and advanced usage.
Try LiteRT-LM immediately from your terminal without writing a single line of code using uv:
```shell
uv tool install litert-lm

litert-lm run \
  --from-huggingface-repo=google/gemma-3n-E2B-it-litert-lm \
  gemma-3n-E2B-it-int4 \
  --prompt="What is the capital of France?"
```

Ready to get started? Explore our language-specific guides and setup instructions.
| Language | Status | Best For... | Documentation |
|---|---|---|---|
| Kotlin | ✅ Stable | Android apps & JVM | Android (Kotlin) Guide |
| Python | ✅ Stable | Prototyping & Scripting | Python Guide |
| C++ | ✅ Stable | High-performance native | C++ Guide |
| Swift | 🚀 In Dev | Native iOS & macOS | (Coming Soon) |
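For scripting and prototyping, the CLI invocation from the quickstart above can also be driven from Python. This is a minimal sketch, not part of the LiteRT-LM API: the `litert-lm run` command and its flags are exactly those shown in the quickstart, while the `build_run_command` helper is an illustrative name of our own.

```python
import shlex

def build_run_command(repo: str, model_file: str, prompt: str) -> list[str]:
    """Assemble the `litert-lm run` invocation shown in the quickstart.

    All flags here (`--from-huggingface-repo`, `--prompt`) come from the
    quickstart example; this helper just fills in the arguments.
    """
    return [
        "litert-lm", "run",
        f"--from-huggingface-repo={repo}",
        model_file,
        f"--prompt={prompt}",
    ]

cmd = build_run_command(
    "google/gemma-3n-E2B-it-litert-lm",
    "gemma-3n-E2B-it-int4",
    "What is the capital of France?",
)
# Print the shell-quoted command; to actually execute it (after
# `uv tool install litert-lm`), pass `cmd` to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Building the argument list explicitly, rather than interpolating a shell string, avoids quoting bugs when prompts contain spaces or special characters.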
This guide shows how to compile LiteRT-LM from source.
- v0.9.0: Improved function calling capabilities and better in-app performance stability.
- v0.8.0: Desktop GPU support and Multi-Modality.
- v0.7.0: NPU acceleration for Gemma models.
For a full list of releases, see GitHub Releases.

