Commit a85e6e3

Add docs on Module extension. (#3798) (#3807)

Summary: Pull Request resolved: #3798 overriding_review_checks_triggers_an_audit_and_retroactive_review Oncall Short Name: executorch Differential Revision: D58065736 fbshipit-source-id: 2d61bbaa7ad6a18f7a4a81d62246b14cbb8f8d02 (cherry picked from commit 13ba3a7) Co-authored-by: Anthony Shoumikhin <[email protected]>

1 parent 50d1da2

8 files changed, +170 -6 lines

docs/source/build-run-coreml.md (+1 -1)

@@ -143,7 +143,7 @@ libsqlite3.tbd
 ```
 5. Add the exported program to the [Copy Bundle Phase](https://developer.apple.com/documentation/xcode/customizing-the-build-phases-of-a-target#Copy-files-to-the-finished-product) of your Xcode target.

-6. Please follow the [running a model](./running-a-model-cpp-tutorial.md) tutorial to integrate the code for loading an ExecuTorch program.
+6. Please follow the [Runtime APIs Tutorial](extension-module.md) to integrate the code for loading an ExecuTorch program.

 7. Update the code to load the program from the Application's bundle.
 ``` objective-c

docs/source/executorch-runtime-api-reference.rst (+1 -1)

@@ -4,7 +4,7 @@ ExecuTorch Runtime API Reference
 The ExecuTorch C++ API provides an on-device execution framework for exported PyTorch models.

 For a tutorial style introduction to the runtime API, check out the
-`runtime api tutorial <running-a-model-cpp-tutorial.html>`__.
+`runtime tutorial <running-a-model-cpp-tutorial.html>`__ and its `simplified <extension-module.html>`__ version.

 Model Loading and Execution
 ---------------------------

docs/source/extension-module.md (new file, +155)

# Running an ExecuTorch Model Using the Module Extension in C++

**Author:** [Anthony Shoumikhin](https://github.com/shoumikhin)

In the [Running an ExecuTorch Model in C++ Tutorial](running-a-model-cpp-tutorial.md), we explored the lower-level ExecuTorch APIs for running an exported model. While these APIs offer zero overhead, great flexibility, and control, they can be verbose and complex for regular use. To simplify this and resemble PyTorch's eager mode in Python, we introduce the `Module` facade APIs over the regular ExecuTorch runtime APIs. The `Module` APIs provide the same flexibility but default to commonly used components like `DataLoader` and `MemoryAllocator`, hiding most intricate details.

## Example

Let's see how we can run the `SimpleConv` model generated from the [Exporting to ExecuTorch tutorial](./tutorials/export-to-executorch-tutorial) using the `Module` APIs:
```cpp
#include <executorch/extension/module/module.h>

using namespace ::torch::executor;

// Create a Module.
Module module("/path/to/model.pte");

// Wrap the input data with a Tensor.
float input[1 * 3 * 256 * 256];
Tensor::SizesType sizes[] = {1, 3, 256, 256};
TensorImpl tensor(ScalarType::Float, std::size(sizes), sizes, input);

// Perform an inference.
const auto result = module.forward({EValue(Tensor(&tensor))});

// Check for success or failure.
if (result.ok()) {
  // Retrieve the output data.
  const auto output = result->at(0).toTensor().const_data_ptr<float>();
}
```

The code now boils down to creating a `Module` and calling `forward()` on it, with no additional setup. Let's take a closer look at these and other `Module` APIs to better understand the internal workings.
## APIs

### Creating a Module

Creating a `Module` object is an extremely fast operation that does not involve significant processing time or memory allocation. The actual loading of a `Program` and a `Method` happens lazily on the first inference, unless explicitly requested with a dedicated API.

```cpp
Module module("/path/to/model.pte");
```
### Force-Loading a Method

To force-load the `Module` (and thus the underlying ExecuTorch `Program`) at any time, use the `load()` function:

```cpp
const auto error = module.load();

assert(module.is_loaded());
```

To force-load a particular `Method`, call the `load_method()` function:

```cpp
const auto error = module.load_method("forward");

assert(module.is_method_loaded("forward"));
```

Note: the `Program` is loaded automatically before any `Method` is loaded. Subsequent attempts to load them have no effect if one of the previous attempts was successful.
### Querying for Metadata

Get the set of method names that a `Module` contains using the `method_names()` function:

```cpp
const auto method_names = module.method_names();

if (method_names.ok()) {
  assert(method_names.count("forward"));
}
```

Note: `method_names()` will try to force-load the `Program` when called for the first time.
Introspect miscellaneous metadata about a particular method via the `MethodMeta` struct returned by the `method_meta()` function:

```cpp
const auto method_meta = module.method_meta("forward");

if (method_meta.ok()) {
  assert(method_meta->name() == "forward");
  assert(method_meta->num_inputs() > 1);

  const auto input_meta = method_meta->input_tensor_meta(0);

  if (input_meta.ok()) {
    assert(input_meta->scalar_type() == ScalarType::Float);
  }

  const auto output_meta = method_meta->output_tensor_meta(0);

  if (output_meta.ok()) {
    assert(output_meta->sizes().size() == 1);
  }
}
```

Note: `method_meta()` will try to force-load the `Method` when called for the first time.
### Performing an Inference

Assuming that the `Program`'s method names and their input formats are known ahead of time, we rarely need to query for those and can run the methods directly by name using the `execute()` function:

```cpp
const auto result = module.execute("forward", {EValue(Tensor(&tensor))});
```

For the standard `forward()` method name, this can be simplified to:

```cpp
const auto result = module.forward({EValue(Tensor(&tensor))});
```

Note: `execute()` and `forward()` will try to force-load the `Program` and the `Method` when called for the first time. Therefore, the first inference will take longer than subsequent ones, as the model is loaded lazily and prepared for execution, unless the `Program` or `Method` was loaded explicitly earlier using the corresponding functions.
### Result and Error Types

Most of the ExecuTorch APIs, including those described above, return either a `Result` or an `Error` type. Let's understand what those are:

* [`Error`](https://github.com/pytorch/executorch/blob/main/runtime/core/error.h) is a C++ enum containing a collection of valid error codes, where the default is `Error::Ok`, denoting success.

* [`Result`](https://github.com/pytorch/executorch/blob/main/runtime/core/result.h) can hold either an `Error` if the operation failed, or a payload if it succeeded, i.e., the actual result of the operation, such as an `EValue` wrapping a `Tensor` or any other standard C++ data type. To check whether a `Result` has a valid value, call the `ok()` function. To get the `Error`, use the `error()` function, and to get the actual data, use the overloaded `get()` function or the dereference operators `*` and `->`.
### Profiling the Module

Use [ExecuTorch Dump](sdk-etdump.md) to trace model execution. Create an instance of the `ETDumpGen` class and pass it to the `Module` constructor. After executing a method, save the `ETDump` to a file for further analysis. You can capture multiple executions in a single trace if desired.

```cpp
#include <cstdlib>
#include <fstream>
#include <memory>

#include <executorch/extension/module/module.h>
#include <executorch/sdk/etdump/etdump_flatcc.h>

using namespace ::torch::executor;

Module module("/path/to/model.pte", Module::MlockConfig::UseMlock, std::make_unique<ETDumpGen>());

// Execute a method, e.g. module.forward(...); or module.execute("my_method", ...);

if (auto* etdump = dynamic_cast<ETDumpGen*>(module.event_tracer())) {
  const auto trace = etdump->get_etdump_data();

  if (trace.buf && trace.size > 0) {
    // Free the trace buffer automatically when the guard goes out of scope.
    std::unique_ptr<void, decltype(&free)> guard(trace.buf, free);
    std::ofstream file("/path/to/trace.etdump", std::ios::binary);

    if (file) {
      file.write(static_cast<const char*>(trace.buf), trace.size);
    }
  }
}
```

docs/source/getting-started-setup.md (+1 -1)

@@ -183,7 +183,7 @@ Output 0: tensor(sizes=[1], [2.])
 ```
 :::

-To learn how to build a similar program, visit the [ExecuTorch in C++ Tutorial](running-a-model-cpp-tutorial.md).
+To learn how to build a similar program, visit the [Runtime APIs Tutorial](extension-module.md).

 ### [Optional] Setting Up Buck2
 **Buck2** is an open-source build system that some of our examples currently utilize for building and running.

docs/source/index.rst (+8)

@@ -93,6 +93,7 @@ Topics in this section will help you get started with ExecuTorch.

    tutorials/export-to-executorch-tutorial
    running-a-model-cpp-tutorial
+   extension-module
    tutorials/sdk-integration-tutorial
    demo-apps-ios
    demo-apps-android

@@ -225,6 +226,13 @@ ExecuTorch tutorials.
    :link: running-a-model-cpp-tutorial.html
    :tags:

+.. customcarditem::
+   :header: Simplified Runtime APIs Tutorial
+   :card_description: A simplified tutorial for executing the model on device.
+   :image: _static/img/generic-pytorch-logo.png
+   :link: extension-module.html
+   :tags:
+
 .. customcarditem::
    :header: Using the ExecuTorch SDK to Profile a Model
    :card_description: A tutorial for using the ExecuTorch SDK to profile and analyze a model with linkage back to source code.

docs/source/llm/getting-started.md (+1 -2)

@@ -344,8 +344,7 @@
 curl -O https://raw.githubusercontent.com/pytorch/executorch/main/examples/llm_manual/managed_tensor.h
 ```

-To learn more, see [Running an ExecuTorch Model in C++](../running-a-model-cpp-tutorial.md)
-and the [ExecuTorch Runtime API Reference](../executorch-runtime-api-reference.md).
+To learn more, see the [Runtime APIs Tutorial](../extension-module.md).

 ### Building and Running

docs/source/running-a-model-cpp-tutorial.md (+1)

@@ -143,3 +143,4 @@ assert(output.isTensor());
 ## Conclusion

 In this tutorial, we went over the APIs and steps required to load and perform an inference with an ExecuTorch model in C++.
+Also, check out the [Simplified Runtime APIs Tutorial](extension-module.md).

docs/source/runtime-overview.md (+2 -1)

@@ -156,7 +156,8 @@ However, please note:

 For more details about the ExecuTorch runtime, please see:

-* [Runtime API Tutorial](running-a-model-cpp-tutorial.md)
+* [Detailed Runtime APIs Tutorial](running-a-model-cpp-tutorial.md)
+* [Simplified Runtime APIs Tutorial](extension-module.md)
 * [Runtime Build and Cross Compilation](runtime-build-and-cross-compilation.md)
 * [Runtime Platform Abstraction Layer](runtime-platform-abstraction-layer.md)
 * [Runtime Profiling](sdk-profiling.md)
