Commit 5611cc3

kimishpatel authored and facebook-github-bot committed
Update iphone 15 pro benchmarking numbers (#2927)
Summary: ATT

Created from CodeHub with https://fburl.com/edit-in-codehub

Reviewed By: mergennachin

Differential Revision: D55895703
1 parent 599cfde commit 5611cc3

1 file changed: +9 -6 lines changed

examples/models/llama2/README.md

+9 -6
@@ -32,15 +32,19 @@ Note that groupsize less than 128 was not enabled, since such model were still t
 
 ## Performance
 
-Performance was measured on Samsung Galaxy S22, S23, S24 and One Plus 12. Measurement performance is in terms of tokens/second.
+Performance was measured on Samsung Galaxy S22, S24, One Plus 12 and iPhone 15 max Pro. Measurement performance is in terms of tokens/second.
 
 |Device | Groupwise 4-bit (128) | Groupwise 4-bit (256)
 |--------| ---------------------- | ---------------
-|Galaxy S22 | 8.15 tokens/second | 8.3 tokens/second |
-|Galaxy S24 | 10.66 tokens/second | 11.26 tokens/second |
-|One plus 12 | 11.55 tokens/second | 11.6 tokens/second |
-|iPhone 15 pro | x | x |
+|Galaxy S22* | 8.15 tokens/second | 8.3 tokens/second |
+|Galaxy S24* | 10.66 tokens/second | 11.26 tokens/second |
+|One plus 12* | 11.55 tokens/second | 11.6 tokens/second |
+|Galaxy S22** | 5.5 tokens/second | 5.9 tokens/second |
+|iPhone 15 pro** | ~6 tokens/second | ~6 tokens/second |
 
+*: Measured via adb binary based [workflow](#Step-5:-Run-benchmark-on-Android-phone)
+
+**: Measured via app based [workflow](#Step-6:-Build-Mobile-apps)
 
 # Instructions
 
@@ -241,7 +245,6 @@ Please refer to [this tutorial](https://pytorch.org/executorch/main/llm/llama-de
 - Enabling LLama2 7b and other architectures via Vulkan
 - Enabling performant execution of widely used quantization schemes.
 
-TODO
 
 # Notes
 This example tries to reuse the Python code, with minimal modifications to make it compatible with current ExecuTorch:
