@@ -22,15 +22,32 @@ the community feedback regarding Google AI Edge's Gemma 3n LiteRT preview. You
want access on more platforms, more visibility into the underlying stack, and
more flexibility. LiteRT-LM can help with all three.

+### 🚀 What's New
+
+*   ***June 24, 2025*** **: Run Gemma models with NPU Support (`v0.7.0`)**
+    Unlock significant performance gains! Our latest release leverages the power
+    of Neural Processing Units (NPUs) on devices with Qualcomm and MediaTek
+    chipsets to run the Gemma3 1B model with incredible efficiency.
+
+    **Note:** LiteRT-LM NPU acceleration is only available through an Early
+    Access Program. Please check out
+    [this page](https://ai.google.dev/edge/litert/next/npu) for more information
+    about how to sign up.
+*   ***June 10, 2025*** **: The Debut of LiteRT-LM: A New Framework for
+    On-Device LLMs** We're proud to release an early preview (`v0.6.1`) of the
+    LiteRT-LM codebase! This foundational release enables you to run the latest
+    Gemma series models across a wide range of devices with initial support for
+    CPU execution and powerful GPU acceleration on Android.
+
### Supported Backends & Platforms

-Platform | CPU Support | GPU Support
-:----------- | :---------: | :-----------:
-**Android** | ✅ | ✅
-**macOS** | ✅ | *Coming Soon*
-**Windows** | ✅ | *Coming Soon*
-**Linux** | ✅ | *Coming Soon*
-**Embedded** | ✅ | *Coming Soon*
+Platform | CPU Support | GPU Support | NPU Support |
+:----------- | :---------: | :-----------: | :-----------:
+**Android** | ✅ | ✅ | ✅ |
+**macOS** | ✅ | *Coming Soon* | - |
+**Windows** | ✅ | *Coming Soon* | - |
+**Linux** | ✅ | *Coming Soon* | - |
+**Embedded** | ✅ | *Coming Soon* | - |

### Supported Models and Performance

@@ -46,17 +63,18 @@ Below are the performance numbers of running each model on various devices. Note
that the benchmark is measured with 1024 tokens prefill and 256 tokens decode (
with performance lock on Android devices).

-| Model | Device | Backend | Prefill (tokens/sec) | Decode (tokens/sec) |
-| :--- | :--- | :--- | :--- | :--- |
-| Gemma3-1B | MacBook Pro<br>(2023 M3) | CPU | 422.98 | 66.89 |
-| Gemma3-1B | Samsung S24<br>(Ultra) | CPU | 243.24 | 43.56 |
-| Gemma3-1B | Samsung S24<br>(Ultra) | GPU | 1876.5 | 44.57 |
-| Gemma3n-E2B | MacBook Pro<br>(2023 M3) | CPU | 232.5 | 27.6 |
-| Gemma3n-E2B | Samsung S24<br>(Ultra) | CPU | 110.5 | 16.1 |
-| Gemma3n-E2B | Samsung S24<br>(Ultra) | GPU | 816.4 | 15.6 |
-| Gemma3n-E4B | MacBook Pro<br>(2023 M3) | CPU | 170.1 | 20.1 |
-| Gemma3n-E4B | Samsung S24<br>(Ultra) | CPU | 73.5 | 9.2 |
-| Gemma3n-E4B | Samsung S24<br>(Ultra) | GPU | 548.0 | 9.4 |
+| Model | Device | Backend | Prefill (tokens/sec) | Decode (tokens/sec) | Context size |
+| :--- | :--- | :--- | :--- | :--- | :--- |
+| Gemma3-1B | MacBook Pro<br>(2023 M3) | CPU | 422.98 | 66.89 | 4096 |
+| Gemma3-1B | Samsung S24<br>(Ultra) | CPU | 243.24 | 43.56 | 4096 |
+| Gemma3-1B | Samsung S24<br>(Ultra) | GPU | 1876.5 | 44.57 | 4096 |
+| Gemma3-1B | Samsung S24<br>(Ultra) | NPU | 5836.6 | 84.8 | 1280 |
+| Gemma3n-E2B | MacBook Pro<br>(2023 M3) | CPU | 232.5 | 27.6 | 4096 |
+| Gemma3n-E2B | Samsung S24<br>(Ultra) | CPU | 110.5 | 16.1 | 4096 |
+| Gemma3n-E2B | Samsung S24<br>(Ultra) | GPU | 816.4 | 15.6 | 4096 |
+| Gemma3n-E4B | MacBook Pro<br>(2023 M3) | CPU | 170.1 | 20.1 | 4096 |
+| Gemma3n-E4B | Samsung S24<br>(Ultra) | CPU | 73.5 | 9.2 | 4096 |
+| Gemma3n-E4B | Samsung S24<br>(Ultra) | GPU | 548.0 | 9.4 | 4096 |
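As a rough reading of the benchmark table above, the two throughput columns can be combined into an approximate end-to-end time for the benchmark workload (1024 prefill tokens, 256 decode tokens). The sketch below is illustrative arithmetic only. The helper `estimated_latency_s` is a hypothetical name and not part of LiteRT-LM. It assumes throughput stays constant over the whole run and ignores model loading and any other overhead.

```python
# Illustrative arithmetic only; not part of the LiteRT-LM API.
# Throughput values (tokens/sec) are copied from the Gemma3-1B rows
# for the Samsung S24 Ultra in the benchmark table.
GEMMA3_1B_S24_ULTRA = {
    "CPU": (243.24, 43.56),
    "GPU": (1876.5, 44.57),
    "NPU": (5836.6, 84.8),
}


def estimated_latency_s(prefill_tps, decode_tps,
                        prefill_tokens=1024, decode_tokens=256):
    """Approximate seconds to prefill and decode, assuming constant throughput."""
    return prefill_tokens / prefill_tps + decode_tokens / decode_tps


if __name__ == "__main__":
    for backend, (prefill_tps, decode_tps) in GEMMA3_1B_S24_ULTRA.items():
        total = estimated_latency_s(prefill_tps, decode_tps)
        print(f"{backend}: ~{total:.1f} s for 1024 prefill + 256 decode tokens")
```

Under these assumptions, decode time is the larger share of the total for all three backends, so the decode column is the better guide to interactive latency. The NPU row was measured with a smaller context size (1280 versus 4096), so the rows are not strictly like for like.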

## Quick Start

@@ -254,6 +272,10 @@ properly installed
[Android Debug Bridge](https://developer.android.com/tools/adb) and have a
connected device that can be accessed via `adb`.

+**Note:** If you are interested in trying out LiteRT-LM with NPU acceleration,
+please check out [this page](https://ai.google.dev/edge/litert/next/npu) for
+more information about how to sign up for the Early Access Program.
+
<details>
<summary><strong>Develop in Linux</strong></summary>
259281