Commit 9bd06b8

ai-edge-bot authored and copybara-github committed
Updated the README with NPU information.
LiteRT-LM-PiperOrigin-RevId: 775322030
1 parent 19154af commit 9bd06b8

1 file changed: +40 -18 lines changed

README.md

Lines changed: 40 additions & 18 deletions
@@ -22,15 +22,32 @@ the community feedback regarding Google AI Edge's Gemma 3n LiteRT preview. You
 want access on more platforms, more visibility into the underlying stack, and
 more flexibility. LiteRT-LM can help with all three.
 
+### 🚀 What's New
+
+* ***June 24, 2025*** **: Run Gemma models with NPU Support (`v0.7.0`)**
+Unlock significant performance gains! Our latest release leverages the power
+of Neural Processing Units (NPUs) on devices with Qualcomm and MediaTek
+chipsets to run the Gemma3 1B model with incredible efficiency.
+
+**Note:** LiteRT-LM NPU acceleration is only available through an Early
+Access Program. Please check out
+[this page](https://ai.google.dev/edge/litert/next/npu) for more information
+about how to sign it up.
+* ***June 10, 2025*** **: The Debut of LiteRT-LM: A New Framework for
+On-Device LLMs** We're proud to release an early preview (`v0.6.1`) of the
+LiteRT-LM codebase! This foundational release enables you to run the latest
+Gemma series models across a wide range of devices with initial support for
+CPU execution and powerful GPU acceleration on Android.
+
 ### Supported Backends & Platforms
 
-Platform | CPU Support | GPU Support
-:----------- | :---------: | :-----------:
-**Android** | ✅ | ✅
-**macOS** | ✅ | *Coming Soon*
-**Windows** | ✅ | *Coming Soon*
-**Linux** | ✅ | *Coming Soon*
-**Embedded** | ✅ | *Coming Soon*
+Platform | CPU Support | GPU Support | NPU Support |
+:----------- | :---------: | :-----------: | :-----------:
+**Android** | ✅ | ✅ | ✅ |
+**macOS** | ✅ | *Coming Soon* | - |
+**Windows** | ✅ | *Coming Soon* | - |
+**Linux** | ✅ | *Coming Soon* | - |
+**Embedded** | ✅ | *Coming Soon* | - |
 
 ### Supported Models and Performance
 
@@ -46,17 +63,18 @@ Below are the performance numbers of running each model on various devices. Note
 that the benchmark is measured with 1024 tokens prefill and 256 tokens decode (
 with performance lock on Android devices).
 
-| Model | Device | Backend | Prefill (tokens/sec) | Decode (tokens/sec) |
-| :--- | :--- | :--- | :--- | :--- |
-| Gemma3-1B | MacBook Pro<br>(2023 M3) | CPU | 422.98 | 66.89 |
-| Gemma3-1B | Samsung S24<br>(Ultra) | CPU | 243.24 | 43.56 |
-| Gemma3-1B | Samsung S24<br>(Ultra) | GPU | 1876.5 | 44.57 |
-| Gemma3n-E2B | MacBook Pro<br>(2023 M3) | CPU | 232.5 | 27.6 |
-| Gemma3n-E2B | Samsung S24<br>(Ultra) | CPU | 110.5 | 16.1 |
-| Gemma3n-E2B | Samsung S24<br>(Ultra) | GPU | 816.4 | 15.6 |
-| Gemma3n-E4B | MacBook Pro<br>(2023 M3) | CPU | 170.1 | 20.1 |
-| Gemma3n-E4B | Samsung S24<br>(Ultra) | CPU | 73.5 | 9.2 |
-| Gemma3n-E4B | Samsung S24<br>(Ultra) | GPU | 548.0 | 9.4 |
+| Model | Device | Backend | Prefill (tokens/sec) | Decode (tokens/sec) | Context size |
+| :--- | :--- | :--- | :--- | :--- | :--- |
+| Gemma3-1B | MacBook Pro<br>(2023 M3) | CPU | 422.98 | 66.89 | 4096 |
+| Gemma3-1B | Samsung S24<br>(Ultra) | CPU | 243.24 | 43.56 | 4096 |
+| Gemma3-1B | Samsung S24<br>(Ultra) | GPU | 1876.5 | 44.57 | 4096 |
+| Gemma3-1B | Samsung S24<br>(Ultra) | NPU | 5836.6 | 84.8 | 1280 |
+| Gemma3n-E2B | MacBook Pro<br>(2023 M3) | CPU | 232.5 | 27.6 | 4096 |
+| Gemma3n-E2B | Samsung S24<br>(Ultra) | CPU | 110.5 | 16.1 | 4096 |
+| Gemma3n-E2B | Samsung S24<br>(Ultra) | GPU | 816.4 | 15.6 | 4096 |
+| Gemma3n-E4B | MacBook Pro<br>(2023 M3) | CPU | 170.1 | 20.1 | 4096 |
+| Gemma3n-E4B | Samsung S24<br>(Ultra) | CPU | 73.5 | 9.2 | 4096 |
+| Gemma3n-E4B | Samsung S24<br>(Ultra) | GPU | 548.0 | 9.4 | 4096 |
 
 ## Quick Start
 
@@ -254,6 +272,10 @@ properly installed
 [Android Debug Bridge](https://developer.android.com/tools/adb) and have a
 connected device that can be accessed via `adb`.
 
+**Note:** If you are interested in trying out LiteRT-LM with NPU acceleration,
+please check out [this page](https://ai.google.dev/edge/litert/next/npu) for
+more information about how to sign it up for an Early Access Program.
+
 <details>
 <summary><strong>Develop in Linux</strong></summary>
 
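Reviewer note: the updated benchmark table reports prefill and decode throughput separately, measured with 1024 prefill tokens and 256 decode tokens. As a rough way to compare backends, the sketch below converts the Gemma3-1B rows for the Samsung S24 Ultra into an estimated time for one such run. The `Throughput` class and `estimated_seconds` helper are hypothetical names introduced here for illustration and are not part of LiteRT-LM. The estimate assumes throughput stays constant over the run and ignores model loading and other overhead. Note also that the NPU row was measured with a context size of 1280, while the CPU and GPU rows used 4096.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Throughput:
    """Throughput figures copied from one row of the README benchmark table."""
    prefill_tps: float  # prefill tokens per second
    decode_tps: float   # decode tokens per second


def estimated_seconds(t: Throughput,
                      prefill_tokens: int = 1024,
                      decode_tokens: int = 256) -> float:
    """Rough wall time: prefill time plus decode time at constant throughput."""
    return prefill_tokens / t.prefill_tps + decode_tokens / t.decode_tps


# Gemma3-1B on Samsung S24 Ultra, values taken from the table in this commit.
GEMMA3_1B_S24 = {
    "CPU": Throughput(prefill_tps=243.24, decode_tps=43.56),
    "GPU": Throughput(prefill_tps=1876.5, decode_tps=44.57),
    "NPU": Throughput(prefill_tps=5836.6, decode_tps=84.8),
}

if __name__ == "__main__":
    for backend, t in GEMMA3_1B_S24.items():
        print(f"{backend}: ~{estimated_seconds(t):.2f} s for 1024 prefill + 256 decode tokens")
```

Under these assumptions the decode phase dominates the total on the GPU and NPU, so the decode column matters more than the prefill column for this token mix.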