We will use a couple of simple PyTorch Modules to explore the end-to-end flow.
This is a very simple PyTorch module with just one [Softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html#torch.nn.Softmax) operator.
```python
import torch

class SoftmaxModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.softmax = torch.nn.Softmax(dim=0)

    def forward(self, x):
        return self.softmax(x)
```
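As a quick sanity check before any export steps, the module can be run eagerly; the input shape here is an arbitrary choice for illustration:

```python
module = SoftmaxModule()
x = torch.randn(4)  # arbitrary 1-D example input
print(module(x))    # probabilities along dim=0, summing to 1
```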
We need to be aware of data types for running networks on the Ethos-U55, as it is an integer-only processor.
In the ExecuTorch AoT pipeline, one of the options is to select a backend, and ExecuTorch offers a variety of them. Selecting a backend is optional; it is typically done to target a particular mode of acceleration or hardware for a given model's compute requirements. Without any backend, the ExecuTorch runtime falls back to a highly portable set of operators that is available by default.
It's expected that on platforms with dedicated acceleration like the Ethos-U55, the non-delegated flow is used for two primary cases:

1. When the network is designed to be very small and best suited to run on the Cortex-M alone.
2. When the network has a mix of operations that can target the NPU and those that can't; for example, the Ethos-U55 supports integer operations, so a floating-point softmax will fall back to execute on the CPU.
In this flow, to illustrate the portability of the ExecuTorch runtime as well as of the operator library, we will skip specifying a backend during `.pte` generation.
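As a minimal sketch of what that looks like, the export goes straight from `torch.export` to an ExecuTorch program with no `to_backend` call. The `SoftmaxModule` is the one defined above; the output file name is illustrative:

```python
import torch
from executorch.exir import to_edge

module = SoftmaxModule().eval()
example_inputs = (torch.randn(4),)

# Export the eager module to an ExportedProgram, then lower to the edge dialect.
exported = torch.export.export(module, example_inputs)
edge = to_edge(exported)

# No to_backend() call: everything runs on the portable CPU operator library.
executorch_program = edge.to_executorch()
with open("softmax.pte", "wb") as f:
    f.write(executorch_program.buffer)
```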
Working with Arm, we introduced a new Arm backend delegate for ExecuTorch.
We can enable this backend delegate by including the following step in the ExecuTorch AoT export pipeline that generates the `.pte` file.
As in the non-delegated flow, the same script serves as a helper utility to generate the `.pte` file. Notice the `--delegate` option, which enables the `to_backend` call.
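Under the hood, the `--delegate` option amounts to inserting a partitioning step between the edge lowering and the final `to_executorch()` call. A hedged sketch of that step is below; `ArmPartitioner` and its import path are assumptions about the Arm backend's Python API and may differ in your ExecuTorch version:

```python
import torch
from executorch.exir import to_edge

# Assumed import path for the Arm backend partitioner; check your
# ExecuTorch version, as this module has moved during development.
from executorch.backends.arm.arm_partitioner import ArmPartitioner

module = SoftmaxModule().eval()
exported = torch.export.export(module, (torch.randn(4),))
edge = to_edge(exported)

# Delegate the subgraphs the Arm backend supports; anything
# unsupported stays on the portable Cortex-M operator library.
edge = edge.to_backend(ArmPartitioner())

with open("softmax_arm_delegate.pte", "wb") as f:
    f.write(edge.to_executorch().buffer)
```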
To generate these libraries, use the following commands:

```bash
# Empty and already created
cd <executorch_source_root_dir>

# Use provided cmake toolchain for bare-metal builds
```
Output from the simulator ends with:

```
Info: Simulation is stopping. Reason: CPU time has been exceeded.
```
Through this tutorial we've learnt how to use ExecuTorch to both export a standard model from PyTorch and run it on the compact and fully functional ExecuTorch runtime, enabling a smooth path for offloading models from PyTorch to Arm-based platforms.
To recap, there are two major flows:
* A direct flow which offloads work onto the Cortex-M using libraries built into ExecuTorch.
* A delegated flow which partitions the graph into sections for Cortex-M and sections which can be offloaded and accelerated on the Ethos-U hardware.
Both of these flows continue to evolve, enabling more use cases and better performance.