Use relative links in llm/getting-started.md #3310

Merged 1 commit on Apr 24, 2024
docs/source/llm/getting-started.md (41 changes: 18 additions & 23 deletions)
@@ -77,7 +77,7 @@ cd ../..
:::
::::

-For more information, see [Setting Up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup.html).
+For more information, see [Setting Up ExecuTorch](../getting-started-setup.md).


## Running a Large Language Model Locally
@@ -161,7 +161,7 @@ with open("nanogpt.pte", "wb") as file:

To export, run the script with `python export_nanogpt.py` (or python3, as appropriate for your environment). It will generate a `nanogpt.pte` file in the current directory.
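For reference, the core of that export script looks roughly like the sketch below. This is a simplified outline, assuming the tutorial's nanoGPT `GPT` class is importable from `model.py`; the full script also handles pre-autograd capture and dynamic sequence lengths.

```python
# Simplified sketch of export_nanogpt.py (assumes nanoGPT's model.py
# is on the path; shapes and checkpoint name are illustrative).
import torch
from executorch.exir import to_edge
from torch.export import export

from model import GPT  # nanoGPT model definition

model = GPT.from_pretrained("gpt2")
example_inputs = (torch.randint(0, model.config.vocab_size, (1, 8)),)

# Trace to an ExportedProgram, lower to an ExecuTorch program, and
# serialize it as nanogpt.pte.
exported_program = export(model, example_inputs)
et_program = to_edge(exported_program).to_executorch()

with open("nanogpt.pte", "wb") as file:
    file.write(et_program.buffer)
```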

-For more information, see [Exporting to ExecuTorch](https://pytorch.org/executorch/main/tutorials/export-to-executorch-tutorial.html) and
+For more information, see [Exporting to ExecuTorch](../tutorials/export-to-executorch-tutorial) and
[torch.export](https://pytorch.org/docs/stable/export.html).

### Step 2. Invoking the Runtime
@@ -305,8 +305,8 @@ curl -O https://raw.githubusercontent.com/GregoryComer/et-tutorials/quantization
curl -O https://raw.githubusercontent.com/GregoryComer/et-tutorials/quantization/nanogpt/basic_sampler.h
```

-To learn more, see [Running an ExecuTorch Model in C++](https://pytorch.org/executorch/main/running-a-model-cpp-tutorial.html)
-and the [ExecuTorch Runtime API Reference](https://pytorch.org/executorch/main/executorch-runtime-api-reference.html).
+To learn more, see [Running an ExecuTorch Model in C++](../running-a-model-cpp-tutorial.md)
+and the [ExecuTorch Runtime API Reference](../executorch-runtime-api-reference.md).
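As a quick sanity check before building the C++ runner, the exported program can also be loaded from Python if ExecuTorch's optional pybindings are installed; this is an assumption about your build, not a required step.

```python
# Optional smoke test of nanogpt.pte via ExecuTorch's (optional) pybindings.
import torch
from executorch.extension.pybindings.portable_lib import _load_for_executorch

module = _load_for_executorch("nanogpt.pte")
tokens = torch.randint(0, 50257, (1, 8))  # GPT-2 vocab size assumed
logits = module.forward([tokens])[0]
print(logits.shape)
```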

### Building and Running

@@ -481,11 +481,9 @@ target_link_libraries(
xnnpack_backend) # Provides the XNNPACK CPU acceleration backend
```

-Keep the rest of the code the same. For more details refer to
-[Exporting to ExecuTorch](https://pytorch.org/executorch/main/llm/getting-started.html#step-1-exporting-to-executorch)
-and
-[Invoking the Runtime](https://pytorch.org/executorch/main/llm/getting-started.html#step-2-invoking-the-runtime)
-for more details
+Keep the rest of the code the same. For more details, refer to
+[Exporting to ExecuTorch](#step-1-exporting-to-executorch) and
+[Invoking the Runtime](#step-2-invoking-the-runtime).

At this point, the working directory should contain the following files:

@@ -520,10 +518,8 @@ Once upon a time, there was a man who was a member of the military...


For more information regarding backend delegation, see the ExecuTorch guides
-for the
-[XNNPACK Backend](https://pytorch.org/executorch/stable/tutorial-xnnpack-delegate-lowering.html)
-and
-[CoreML Backend](https://pytorch.org/executorch/stable/build-run-coreml.html).
+for the [XNNPACK Backend](../tutorial-xnnpack-delegate-lowering.md) and [CoreML
+Backend](../build-run-coreml.md).
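As a minimal sketch of XNNPACK delegation (stand-in model; import paths match the ExecuTorch releases this tutorial targets), the only change from the plain export flow is the `to_backend()` call:

```python
import torch
from executorch.backends.xnnpack.partition.xnnpack_partitioner import (
    XnnpackPartitioner,
)
from executorch.exir import to_edge
from torch.export import export

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())  # stand-in
edge = to_edge(export(model, (torch.randn(1, 8),)))

# Partition supported subgraphs and delegate them to XNNPACK.
edge = edge.to_backend(XnnpackPartitioner())
et_program = edge.to_executorch()

with open("model_xnnpack.pte", "wb") as file:
    file.write(et_program.buffer)
```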

## Quantization

@@ -609,7 +605,7 @@ target_link_libraries(
xnnpack_backend) # Provides the XNNPACK CPU acceleration backend
```

-For more information, see [Quantization in ExecuTorch](https://pytorch.org/executorch/stable/quantization-overview.html).
+For more information, see [Quantization in ExecuTorch](../quantization-overview.md).
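A condensed sketch of the PT2E quantization flow referenced above, using the XNNPACK quantizer (module paths as of the PyTorch 2.x releases this tutorial targets; the model is a stand-in):

```python
import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())  # stand-in
example_inputs = (torch.randn(1, 8),)

m = capture_pre_autograd_graph(model, example_inputs)
quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
m = prepare_pt2e(m, quantizer)
m(*example_inputs)   # calibration pass with representative data
m = convert_pt2e(m)  # quantized graph; export to ExecuTorch as before
```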

## Profiling and Debugging
After lowering a model by calling `to_backend()`, you may want to see what got delegated and what didn’t. ExecuTorch
@@ -687,7 +683,7 @@ Through the ExecuTorch SDK, users are able to profile model execution, giving ti

##### ETRecord generation (Optional)

-An ETRecord is an artifact generated at the time of export that contains model graphs and source-level metadata linking the ExecuTorch program to the original PyTorch model. You can view all profiling events without an ETRecord, though with an ETRecord, you will also be able to link each event to the types of operators being executed, module hierarchy, and stack traces of the original PyTorch source code. For more information, see [https://pytorch.org/executorch/main/sdk-etrecord.html](https://pytorch.org/executorch/main/sdk-etrecord.html)
+An ETRecord is an artifact generated at the time of export that contains model graphs and source-level metadata linking the ExecuTorch program to the original PyTorch model. You can view all profiling events without an ETRecord, though with an ETRecord, you will also be able to link each event to the types of operators being executed, module hierarchy, and stack traces of the original PyTorch source code. For more information, see [the ETRecord docs](../sdk-etrecord.md).


In your export script, after calling `to_edge()` and `to_executorch()`, call `generate_etrecord()` with the `EdgeProgramManager` from `to_edge()` and the `ExecuTorchProgramManager` from `to_executorch()`. Make sure to copy the `EdgeProgramManager`, as the call to `to_backend()` mutates the graph in-place.
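A sketch of that sequence, with a stand-in model; the `executorch.sdk` import path reflects the SDK naming used in this tutorial:

```python
import copy

import torch
from executorch.backends.xnnpack.partition.xnnpack_partitioner import (
    XnnpackPartitioner,
)
from executorch.exir import to_edge
from executorch.sdk import generate_etrecord
from torch.export import export

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())  # stand-in
edge = to_edge(export(model, (torch.randn(1, 8),)))

# to_backend() mutates the graph in place, so keep a pristine copy
# of the EdgeProgramManager for the ETRecord.
edge_copy = copy.deepcopy(edge)

edge = edge.to_backend(XnnpackPartitioner())
et_program = edge.to_executorch()

generate_etrecord("etrecord.bin", edge_copy, et_program)
```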
@@ -709,7 +705,7 @@ Run the export script and the ETRecord will be generated as `etrecord.bin`.

##### ETDump generation

-An ETDump is an artifact generated at runtime containing a trace of the model execution. For more information, see [https://pytorch.org/executorch/main/sdk-etdump.html](https://pytorch.org/executorch/main/sdk-etdump.html)
+An ETDump is an artifact generated at runtime containing a trace of the model execution. For more information, see [the ETDump docs](../sdk-etdump.md).

Include the ETDump header in your code.
```cpp
@@ -779,7 +775,7 @@ This prints the performance data in a tabular format in “inspector_out.txt”,
![](../_static/img/llm_manual_print_data_tabular.png)
<a href="../_static/img/llm_manual_print_data_tabular.png" target="_blank">View in full size</a>

-To learn more about the Inspector and the rich functionality it provides, see the [Inspector API Reference](https://pytorch.org/executorch/main/sdk-inspector.html).
+To learn more about the Inspector and the rich functionality it provides, see the [Inspector API Reference](../sdk-inspector.md).
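A minimal Inspector sketch, assuming the ETDump and ETRecord file names from the previous steps:

```python
from executorch.sdk import Inspector

# Correlate runtime events (ETDump) with export metadata (ETRecord).
inspector = Inspector(etdump_path="etdump.etdp", etrecord="etrecord.bin")
inspector.print_data_tabular()
```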

## Custom Kernels
With the ExecuTorch custom operator APIs, custom operator and kernel authors can easily bring in their kernel into PyTorch/ExecuTorch.
@@ -857,7 +853,7 @@ torch.ops.load_library("libcustom_linear.so")
Once loaded, you can use the custom operator in PyTorch code.
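For example, a call along these lines; the `myop::custom_linear` registration name and its signature are assumptions for illustration, not the tutorial's exact op:

```python
import torch

torch.ops.load_library("libcustom_linear.so")

x = torch.randn(2, 8)
weight = torch.randn(4, 8)
out = torch.ops.myop.custom_linear(x, weight)  # assumed op name/signature
```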

For more information, see [PyTorch Custom Operators](https://pytorch.org/tutorials/advanced/torch_script_custom_ops.html)
-and [ExecuTorch Kernel Registration](https://pytorch.org/executorch/stable/kernel-library-custom-aten-kernel.html).
+and [ExecuTorch Kernel Registration](../kernel-library-custom-aten-kernel.md).

### Using a Custom Operator in a Model

@@ -879,9 +875,8 @@ def replace_linear_with_custom_linear(module):

The remaining steps are the same as the normal flow. Now you can run this module in eager mode as well as export to ExecuTorch.
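For reference, the module-swapping pattern named above usually looks like the following sketch; `CustomLinear` and the `myop::custom_linear` op name are assumptions, not the tutorial's exact code.

```python
import torch
import torch.nn as nn


class CustomLinear(nn.Module):  # assumed wrapper around the custom op
    def __init__(self, weight: torch.Tensor):
        super().__init__()
        self.weight = nn.Parameter(weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.ops.myop.custom_linear(x, self.weight)


def replace_linear_with_custom_linear(module: nn.Module) -> None:
    # Recursively swap every nn.Linear child for the custom wrapper.
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(module, name, CustomLinear(child.weight.detach()))
        else:
            replace_linear_with_custom_linear(child)
```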

-## How to build Mobile Apps
-You can execute an LLM using ExecuTorch on iOS and Android.
+## How to Build Mobile Apps
+See the instructions for building and running LLMs using ExecuTorch on iOS and Android.

-**For iOS see the [iLLaMA App](https://pytorch.org/executorch/main/llm/llama-demo-ios.html).**
-
-**For Android, see the [Android Sample App](https://pytorch.org/executorch/main/llm/llama-demo-android.html).**
+* **[iOS ExecuTorch LLaMA Demo App](llama-demo-ios.md)**
+* **[Android ExecuTorch LLaMA Demo App](llama-demo-android.md)**