
Commit 94010b3

Docs for lowering smaller models to MPS/CoreML/QNN
Pull Request resolved: #3146 ghstack-source-id: 223235858 Differential Revision: [D56340028](https://our.internmc.facebook.com/intern/diff/D56340028/)

1 file changed (+10, −0)

examples/models/llama2/README.md

Lines changed: 10 additions & 0 deletions
@@ -260,6 +260,16 @@ Please refer to [this tutorial](https://pytorch.org/executorch/main/llm/llama-de
### Android
Please refer to [this tutorial](https://pytorch.org/executorch/main/llm/llama-demo-android.html) for full instructions on building the Android LLAMA Demo App.

+ ## Optional: Smaller models delegated to other backends
+ Currently we support lowering the stories model to other backends, including CoreML, MPS, and QNN. Please refer to the instructions
+ for each backend ([CoreML](https://pytorch.org/executorch/main/build-run-coreml.html), [MPS](https://pytorch.org/executorch/main/build-run-mps.html), [QNN](https://pytorch.org/executorch/main/build-run-qualcomm.html)) before trying to lower the model. After the backend library is installed, export a lowered model with one of the following commands:
+
+ - CoreML: `python -m examples.models.llama2.export_llama -kv --coreml -c stories110M.pt -p params.json`
+ - MPS: `python -m examples.models.llama2.export_llama -kv --mps -c stories110M.pt -p params.json`
+ - QNN: `python -m examples.models.llama2.export_llama -kv --qnn -c stories110M.pt -p params.json`
+
+ The iOS LLAMA app supports the CoreML and MPS models, and the Android LLAMA app supports the QNN model. On Android, you can also cross-compile the llama runner binary, push it to the device, and run it there (see the sketch after this diff).
+

# What is coming next?
## Quantization
- Enabling FP16 model to leverage smaller groupsize for 4-bit quantization.
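
For readers following the Android path, here is a minimal sketch of the push-and-run flow mentioned in the added text. The output file name `llama2_qnn.pte`, the tokenizer file name, the on-device directory, the runner binary location, and the runner flags are assumptions for illustration; the linked QNN and Android tutorials give the authoritative build and run steps.

```bash
# Hypothetical flow: assumes the llama runner has already been cross-compiled
# for Android (per the Android tutorial) and that the QNN-lowered .pte file was
# exported with the command shown in the diff above.

# Push the lowered model, tokenizer, and runner binary to the device.
adb push llama2_qnn.pte /data/local/tmp/llama/
adb push tokenizer.bin /data/local/tmp/llama/
adb push cmake-android-out/examples/models/llama2/llama_main /data/local/tmp/llama/

# Run the runner on-device with a short prompt (flag names are assumed).
adb shell "cd /data/local/tmp/llama && \
  ./llama_main --model_path=llama2_qnn.pte --tokenizer_path=tokenizer.bin --prompt='Once upon a time'"
```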
