Update Core ML Backend Doc #3188
115 changes: 90 additions & 25 deletions backends/apple/coreml/README.md
@@ -6,58 +6,123 @@ Core ML is an optimized framework for running machine learning models on Apple devices.

## Layout
- `compiler/`: Lowers a module to the Core ML backend.
- `partition/`: Partitions a module fully or partially to the Core ML backend.
- `quantizer/`: Quantizes a module in a Core ML-favored scheme.
- `scripts/`: Scripts for installing dependencies and running tests.
- `runtime/`: Core ML delegate runtime implementation.
    - `inmemoryfs`: In-memory filesystem implementation used to serialize/de-serialize the AOT blob.
    - `kvstore`: Persistent key-value store implementation.
    - `delegate`: Runtime implementation.
    - `include`: Public headers.
    - `tests`: Tests for the Core ML delegate.
    - `workspace`: Xcode workspace for tests.
    - `sdk`: SDK implementation.
    - `tests`: Unit tests.
    - `workspace`: Xcode workspace for the runtime.
- `third-party/`: External dependencies.

## Help & Improvements
If you have problems or questions or have suggestions for ways to make
implementation and testing better, please create an issue on [github](https://www.github.com/pytorch/executorch/issues).
## Partition and Delegation

## Delegation

For delegating the Program to the **Core ML** backend, the client must be responsible for calling `to_backend` with the **CoreMLBackend** tag.
To delegate a Program to the **Core ML** backend, the client must call `to_backend` with the **CoreMLPartitioner**.

```python
import executorch.exir as exir
import torch

from torch.export import export

from executorch.exir import to_edge

from executorch.exir.backend.backend_api import to_backend
import executorch.exir

from executorch.backends.apple.coreml.compiler import CoreMLBackend
from executorch.backends.apple.coreml.partition.coreml_partitioner import CoreMLPartitioner

class LowerableSubModel(torch.nn.Module):
class Model(torch.nn.Module):
def __init__(self):
super().__init__()

def forward(self, x):
return torch.sin(x)

# Convert the lowerable module to Edge IR Representation
to_be_lowered = LowerableSubModel()
example_input = (torch.ones(1), )
to_be_lowered_exir_submodule = to_edge(export(to_be_lowered, example_input))
source_model = Model()
example_inputs = (torch.ones(1), )

# Export the source model to Edge IR representation
aten_program = torch.export.export(source_model, example_inputs)
edge_program_manager = executorch.exir.to_edge(aten_program)

# Delegate to Core ML backend
delegated_program_manager = edge_program_manager.to_backend(CoreMLPartitioner())

# Lower to Core ML backend
lowered_module = to_backend('CoreMLBackend', to_be_lowered_exir_submodule.exported_program, [])
# Serialize delegated program
executorch_program = delegated_program_manager.to_executorch()
with open("model.pte", "wb") as f:
f.write(executorch_program.buffer)
```

Currently, the **Core ML** backend delegates the whole module to **Core ML**. If a specific op is not supported by the **Core ML** backend then the `to_backend` call would throw an exception. We will be adding a **Core ML Partitioner** to resolve the issue.
The module will be fully or partially delegated to **Core ML**, depending on whether the **Core ML** backend supports all or only some of its ops. Users may force certain ops to be skipped with `CoreMLPartitioner(skip_ops_for_coreml_delegation=...)`, as sketched below.
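A minimal sketch of skipping specific ops during partitioning (the op name below is illustrative; use the targets that actually appear in your exported graph):

```python
from executorch.backends.apple.coreml.partition.coreml_partitioner import CoreMLPartitioner

# Keep `mul` ops out of the Core ML delegate; they will fall back to the
# default ExecuTorch kernels instead. "aten.mul.Tensor" is an illustrative name.
partitioner = CoreMLPartitioner(skip_ops_for_coreml_delegation=["aten.mul.Tensor"])
delegated_program_manager = edge_program_manager.to_backend(partitioner)
```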

The `to_backend` implementation is a thin wrapper over [coremltools](https://apple.github.io/coremltools/docs-guides/); `coremltools` is responsible for converting an **ExportedProgram** to an **MLModel**. The converted **MLModel** data is saved, flattened, and returned as bytes to **ExecuTorch**.
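Conceptually, the conversion resembles the following simplified sketch. This is an illustration, not the backend's actual code, and whether `coremltools` accepts an exported program directly depends on your `coremltools` version:

```python
import coremltools as ct

# Hypothetical illustration of the lowering step: convert the exported
# program with coremltools and serialize the resulting MLModel data.
mlmodel = ct.convert(aten_program)  # `aten_program` from the example above
mlmodel.save("lowered.mlpackage")
```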

## Quantization

The `to_backend` implementation is a thin wrapper over `coremltools`, `coremltools` is responsible for converting an **ExportedProgram** to a **MLModel**. The converted **MLModel** data is saved, flattened, and returned as bytes to **ExecuTorch**.
To quantize a Program in a Core ML-favored way, the client may use **CoreMLQuantizer**.

```python
import torch
import executorch.exir

from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import (
convert_pt2e,
prepare_pt2e,
prepare_qat_pt2e,
)

from executorch.backends.apple.coreml.quantizer.coreml_quantizer import CoreMLQuantizer
from coremltools.optimize.torch.quantization.quantization_config import (
LinearQuantizerConfig,
QuantizationScheme,
)

class Model(torch.nn.Module):
def __init__(self) -> None:
super().__init__()
self.conv = torch.nn.Conv2d(
in_channels=3, out_channels=16, kernel_size=3, padding=1
)
self.relu = torch.nn.ReLU()

def forward(self, x: torch.Tensor) -> torch.Tensor:
a = self.conv(x)
return self.relu(a)

source_model = Model()
example_inputs = (torch.randn((1, 3, 256, 256)), )

pre_autograd_aten_dialect = capture_pre_autograd_graph(source_model, example_inputs)

quantization_config = LinearQuantizerConfig.from_dict(
{
"global_config": {
"quantization_scheme": QuantizationScheme.symmetric,
"activation_dtype": torch.uint8,
"weight_dtype": torch.int8,
"weight_per_channel": True,
}
}
)
quantizer = CoreMLQuantizer(quantization_config)

# For post-training quantization, use `prepare_pt2e`
# For quantization-aware training, use `prepare_qat_pt2e`
prepared_graph = prepare_pt2e(pre_autograd_aten_dialect, quantizer)

# Calibrate the prepared graph with representative inputs
prepared_graph(*example_inputs)
converted_graph = convert_pt2e(prepared_graph)
```

The `converted_graph` is the quantized torch model and can be delegated to **Core ML** through **CoreMLPartitioner** in the same way, as sketched below.
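A minimal sketch of that hand-off, reusing the names from the snippets above:

```python
# Export the quantized graph and delegate it, mirroring the earlier flow.
aten_program = torch.export.export(converted_graph, example_inputs)
edge_program_manager = executorch.exir.to_edge(aten_program)
delegated_program_manager = edge_program_manager.to_backend(CoreMLPartitioner())

# Serialize to a .pte file for the runtime.
executorch_program = delegated_program_manager.to_executorch()
with open("quantized_model.pte", "wb") as f:
    f.write(executorch_program.buffer)
```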

## Runtime

To execute a **Core ML** delegated **Program**, the client must link to the `coremldelegate` library. Once linked there are no additional steps required, **ExecuTorch** when running the **Program** would call the **Core ML** runtime to execute the **Core ML** delegated part of the **Program**.
To execute a Core ML delegated program, the application must link against the `coremldelegate` library. Once linked, no additional steps are required: when running the program, ExecuTorch calls the Core ML runtime to execute the Core ML delegated parts of the program.

Please follow the instructions described in the [Core ML setup](/backends/apple/coreml/setup.md) to link the `coremldelegate` library.

## Help & Improvements
If you have problems or questions, or have suggestions for ways to make implementation and testing better, please create an issue on [github](https://www.github.com/pytorch/executorch/issues).
16 changes: 8 additions & 8 deletions backends/apple/coreml/setup.md
@@ -29,8 +29,8 @@ python3 -m examples.apple.coreml.scripts.export --model_name add
4. You can now integrate the **Core ML** backend in code.

```python
# Lower to Core ML backend
lowered_module = to_backend('CoreMLBackend', to_be_lowered_exir_submodule, [])
# Delegate to Core ML backend
delegated_program_manager = edge_program_manager.to_backend(CoreMLPartitioner())
```


@@ -46,15 +46,15 @@ lowered_module = to_backend('CoreMLBackend', to_be_lowered_exir_submodule, [])
xcode-select --install
```

2. Build **Core ML** delegate. The following will create a `executorch.xcframework` in `cmake-out` directory.
4. Build **Core ML** delegate. The following will create `executorch.xcframework` and `coreml_backend.xcframework` in the `cmake-out` directory.

```bash
cd executorch
./build/build_apple_frameworks.sh --Release --coreml
```
3. Open the project in Xcode, and drag the `executorch.xcframework` generated from Step 2 to Frameworks.
5. Open the project in Xcode, and drag `executorch.xcframework` and `coreml_backend.xcframework` frameworks generated from Step 2 to Frameworks.

4. Go to project Target’s Build Phases - Link Binaries With Libraries, click the + sign, and add the following frameworks:
6. Go to project Target’s Build Phases - Link Binaries With Libraries, click the + sign, and add the following frameworks:

```
executorch.xcframework
coreml_backend.xcframework
```

5. Go to project Target’s Build Phases - Link Binaries With Libraries, click the + sign, and add the following frameworks.
```
- Accelerate.framework
- CoreML.framework
- libsqlite3.tbd
Accelerate.framework
CoreML.framework
libsqlite3.tbd
```

6. The target could now run a **Core ML** delegated **Program**.
63 changes: 46 additions & 17 deletions docs/source/build-run-coreml.md
@@ -1,6 +1,6 @@
# Building and Running ExecuTorch with Core ML Backend

Core ML delegate uses Core ML apis to enable running neural networks via Apple's hardware acceleration. For more about coreml you can read [here](https://developer.apple.com/documentation/coreml). In this tutorial we will walk through steps of lowering a PyTorch model to Core ML delegate
The Core ML delegate uses Core ML APIs to enable running neural networks via Apple's hardware acceleration. You can read more about Core ML [here](https://developer.apple.com/documentation/coreml). In this tutorial, we will walk through the steps of lowering a PyTorch model to the Core ML delegate.


::::{grid} 2
@@ -67,22 +67,53 @@ python3 -m examples.apple.coreml.scripts.export --model_name mv3

### Runtime:

**Running the Core ML delegated Program**:
**Running a Core ML delegated Program**:
1. Build the runner.
```bash
cd executorch

# Generates ./coreml_executor_runner.
# Builds `coreml_executor_runner`.
./examples/apple/coreml/scripts/build_executor_runner.sh
```
2. Run the exported program.
2. Run the Core ML delegated program.
```bash
cd executorch

# Runs the exported mv3 model on the Core ML backend.
# Runs the exported mv3 model using the Core ML backend.
./coreml_executor_runner --model_path mv3_coreml_all.pte
```

**Profiling a Core ML delegated Program**:

Note that profiling is supported on [macOS](https://developer.apple.com/macos) >= 14.4.

1. [Optional] Generate an [ETRecord](./sdk-etrecord.rst) when exporting your model.
```bash
cd executorch

# Generates `mv3_coreml_all.pte` and `mv3_coreml_etrecord.bin` files.
python3 -m examples.apple.coreml.scripts.export --model_name mv3 --generate_etrecord
```

2. Build the runner.
```bash
# Builds `coreml_executor_runner`.
./examples/apple/coreml/scripts/build_executor_runner.sh
```
3. Run and generate an [ETDump](./sdk-etdump.md).
```bash
cd executorch

# Generate the ETDump file.
./coreml_executor_runner --model_path mv3_coreml_all.pte --profile_model --etdump_path etdump.etdp
```

4. Create an instance of the [Inspector API](./sdk-inspector.rst) by passing in the [ETDump](./sdk-etdump.md) you have sourced from the runtime, along with the optionally generated [ETRecord](./sdk-etrecord.rst) from step 1, or execute the following command in your terminal to display the profiling data table.
```bash
python examples/apple/coreml/scripts/inspector_cli.py --etdump_path etdump.etdp --etrecord_path mv3_coreml_etrecord.bin
```
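Alternatively, a minimal Python sketch of the Inspector flow (the import path and method below follow the SDK docs for this release; treat them as assumptions if your version differs):

```python
from executorch.sdk import Inspector

# Pair the runtime ETDump with the optional AOT ETRecord to map events back to source.
inspector = Inspector(etdump_path="etdump.etdp", etrecord="mv3_coreml_etrecord.bin")
inspector.print_data_tabular()  # display the profiling data table
```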


## Deploying and running on a device

**Running the Core ML delegated Program in the Demo iOS App**:
@@ -92,37 +92,35 @@ cd executorch

3. Complete the [Final Steps](demo-apps-ios.md#final-steps) section of the tutorial to build and run the demo app.

<br>**Running the Core ML delegated Program in your own App**
1. Build **Core ML** delegate. The following will create a `executorch.xcframework` in the `cmake-out` directory.
<br>**Running the Core ML delegated Program in your App**
1. Build the frameworks. Running the following will create `executorch.xcframework` and `coreml_backend.xcframework` in the `cmake-out` directory.
```bash
cd executorch
./build/build_apple_frameworks.sh --Release --coreml
```
2. Create a new [Xcode project](https://developer.apple.com/documentation/xcode/creating-an-xcode-project-for-an-app#) or open an existing project.

3. Drag the `executorch.xcframework` generated from Step 2 to Frameworks.
3. Drag the `executorch.xcframework` and `coreml_backend.xcframework` generated from Step 2 to Frameworks.

4. Go to the project's [Build Phases](https://developer.apple.com/documentation/xcode/customizing-the-build-phases-of-a-target) - Link Binaries With Libraries, click the + sign, and add the following frameworks:
```
- executorch.xcframework
- coreml_backend.xcframework
- Accelerate.framework
- CoreML.framework
- libsqlite3.tbd
executorch.xcframework
coreml_backend.xcframework
Accelerate.framework
CoreML.framework
libsqlite3.tbd
```
5. Add the exported program to the [Copy Bundle Phase](https://developer.apple.com/documentation/xcode/customizing-the-build-phases-of-a-target#Copy-files-to-the-finished-product) of your Xcode target.

6. Please follow the [running a model](running-a-model-cpp-tutorial.md) tutorial to integrate the code for loading a ExecuTorch program.
6. Please follow the [running a model](./running-a-model-cpp-tutorial.md) tutorial to integrate the code for loading an ExecuTorch program.

7. Update the code to load the program from the Application's bundle.
``` objective-c
using namespace torch::executor;

NSURL *model_url = [NSBundle.mainBundle URLForResource:@"mv3_coreml_all" extension:@"pte"];

Result<util::FileDataLoader> loader =
util::FileDataLoader::from(model_url.path.UTF8String);

Result<util::FileDataLoader> loader = util::FileDataLoader::from(model_url.path.UTF8String);
```

8. Use [Xcode](https://developer.apple.com/documentation/xcode/building-and-running-an-app#Build-run-and-debug-your-app) to deploy the application on the device.
26 changes: 18 additions & 8 deletions examples/apple/coreml/README.md
@@ -1,6 +1,6 @@
# Examples

This directory contains scripts and other helper utilities to illustrate an end-to-end workflow to run a **Core ML** delegated `torch.nn.module` with the **ExecuTorch** runtime.
This directory contains scripts and other helper utilities to illustrate an end-to-end workflow to run a Core ML delegated `torch.nn.module` with the ExecuTorch runtime.


## Directory structure
@@ -13,7 +13,7 @@ coreml

## Using the examples

We will walk through an example model to generate a **Core ML** delegated binary file from a python `torch.nn.module` then we will use the `coreml/executor_runner` to run the exported binary file.
We will walk through an example model to generate a Core ML delegated binary file from a Python `torch.nn.Module`, and then use the `coreml_executor_runner` to run the exported binary file.

1. Following the setup guide in [Setting Up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup)
you should be able to get the basic development environment for ExecuTorch working.
@@ -27,40 +27,50 @@ cd executorch

```

3. Run the export script to generate a **Core ML** delegated binary file.
3. Run the export script to generate a Core ML delegated binary file.

```bash
cd executorch

# To get a list of example models
python3 -m examples.portable.scripts.export -h

# Generates ./add_coreml_all.pte file if successful.
# Generates add_coreml_all.pte file if successful.
python3 -m examples.apple.coreml.scripts.export --model_name add
```

4. Once we have the **Core ML** delegated model binary (pte) file, then let's run it with the **ExecuTorch** runtime using the `coreml_executor_runner`.
4. Run the binary file using the `coreml_executor_runner`.

```bash
cd executorch

# Builds the Core ML executor runner. Generates ./coreml_executor_runner if successful.
./examples/apple/coreml/scripts/build_executor_runner.sh

# Run the Core ML delegate model.
# Run the delegated model.
./coreml_executor_runner --model_path add_coreml_all.pte
```

## Frequently encountered errors and resolution.
- The `examples.apple.coreml.scripts.export` could fail if the model is not supported by the **Core ML** backend. The following models from the examples models list (` python3 -m examples.portable.scripts.export -h`)are currently supported by the **Core ML** backend.
- The `examples.apple.coreml.scripts.export` script can fail if the model is not supported by the Core ML backend. The following models from the examples list (`python3 -m examples.portable.scripts.export -h`) are currently supported by the Core ML backend.

```
```text
add
add_mul
dl3
edsr
emformer_join
emformer_predict
emformer_transcribe
ic3
ic4
linear
llama2
```

> **Review comment (Contributor):** have you tried llama3? If llama2 works, llama3 should be just out of box.
>
> **Reply (Contributor):** Not yet, going to try today.

```text
llava_encoder
mobilebert
mul
mv2
mv2_untrained
mv3
resnet18
resnet50
```