diff --git a/docs/source/llm/getting-started.md b/docs/source/llm/getting-started.md
index 3bff5c903f8..1304636e9c6 100644
--- a/docs/source/llm/getting-started.md
+++ b/docs/source/llm/getting-started.md
@@ -77,7 +77,7 @@ cd ../..
 :::
 ::::
 
-For more information, see [Setting Up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup.html).
+For more information, see [Setting Up ExecuTorch](../getting-started-setup.md).
 
 ## Running a Large Language Model Locally
 
@@ -161,7 +161,7 @@ with open("nanogpt.pte", "wb") as file:
 
 To export, run the script with `python export_nanogpt.py` (or python3, as appropriate for your environment). It will generate a `nanogpt.pte` file in the current directory.
 
-For more information, see [Exporting to ExecuTorch](https://pytorch.org/executorch/main/tutorials/export-to-executorch-tutorial.html) and
+For more information, see [Exporting to ExecuTorch](../tutorials/export-to-executorch-tutorial.md) and
 [torch.export](https://pytorch.org/docs/stable/export.html).
 
 ### Step 2. Invoking the Runtime
@@ -305,8 +305,8 @@ curl -O https://raw.githubusercontent.com/GregoryComer/et-tutorials/quantization
 curl -O https://raw.githubusercontent.com/GregoryComer/et-tutorials/quantization/nanogpt/basic_sampler.h
 ```
 
-To learn more, see [Running an ExecuTorch Model in C++](https://pytorch.org/executorch/main/running-a-model-cpp-tutorial.html)
-and the [ExecuTorch Runtime API Reference](https://pytorch.org/executorch/main/executorch-runtime-api-reference.html).
+To learn more, see [Running an ExecuTorch Model in C++](../running-a-model-cpp-tutorial.md)
+and the [ExecuTorch Runtime API Reference](../executorch-runtime-api-reference.md).
 
 ### Building and Running
 
@@ -481,11 +481,9 @@ target_link_libraries(
   xnnpack_backend) # Provides the XNNPACK CPU acceleration backend
 ```
 
-Keep the rest of the code the same. For more details refer to
-[Exporting to ExecuTorch](https://pytorch.org/executorch/main/llm/getting-started.html#step-1-exporting-to-executorch)
-and
-[Invoking the Runtime](https://pytorch.org/executorch/main/llm/getting-started.html#step-2-invoking-the-runtime)
-for more details
+Keep the rest of the code the same. For more details, refer to [Exporting
+to ExecuTorch](#step-1-exporting-to-executorch) and [Invoking the
+Runtime](#step-2-invoking-the-runtime).
 
 At this point, the working directory should contain the following files:
 
@@ -520,10 +518,8 @@ Once upon a time, there was a man who was a member of the military...
 
-For more information regarding backend delegateion, see the ExecuTorch guides
-for the
-[XNNPACK Backend](https://pytorch.org/executorch/stable/tutorial-xnnpack-delegate-lowering.html)
-and
-[CoreML Backend](https://pytorch.org/executorch/stable/build-run-coreml.html).
+For more information regarding backend delegation, see the ExecuTorch guides
+for the [XNNPACK Backend](../tutorial-xnnpack-delegate-lowering.md) and [CoreML
+Backend](../build-run-coreml.md).
 
 ## Quantization
 
 
@@ -609,7 +605,7 @@ target_link_libraries(
   xnnpack_backend) # Provides the XNNPACK CPU acceleration backend
 ```
 
-For more information, see [Quantization in ExecuTorch](https://pytorch.org/executorch/stable/quantization-overview.html).
+For more information, see [Quantization in ExecuTorch](../quantization-overview.md).
 
 ## Profiling and Debugging
 After lowering a model by calling `to_backend()`, you may want to see what got delegated and what didn’t. ExecuTorch
@@ -687,7 +683,7 @@ Through the ExecuTorch SDK, users are able to profile model execution, giving ti
 
 ##### ETRecord generation (Optional)
 
-An ETRecord is an artifact generated at the time of export that contains model graphs and source-level metadata linking the ExecuTorch program to the original PyTorch model. You can view all profiling events without an ETRecord, though with an ETRecord, you will also be able to link each event to the types of operators being executed, module hierarchy, and stack traces of the original PyTorch source code. For more information, see [https://pytorch.org/executorch/main/sdk-etrecord.html](https://pytorch.org/executorch/main/sdk-etrecord.html)
+An ETRecord is an artifact generated at the time of export that contains model graphs and source-level metadata linking the ExecuTorch program to the original PyTorch model. You can view all profiling events without an ETRecord, though with an ETRecord, you will also be able to link each event to the types of operators being executed, module hierarchy, and stack traces of the original PyTorch source code. For more information, see [the ETRecord docs](../sdk-etrecord.md).
 
 In your export script, after calling `to_edge()` and `to_executorch()`, call `generate_etrecord()` with the `EdgeProgramManager` from `to_edge()` and the `ExecuTorchProgramManager` from `to_executorch()`. Make sure to copy the `EdgeProgramManager`, as the call to `to_backend()` mutates the graph in-place.
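+
+As a minimal sketch of that flow (reusing the `edge_manager` name from the export steps above; the import paths shown here may vary between ExecuTorch releases):
+
+```python
+import copy
+
+from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
+from executorch.sdk import generate_etrecord
+
+# Copy the edge program before lowering, since to_backend() mutates it in-place.
+edge_manager_copy = copy.deepcopy(edge_manager)
+
+et_program = edge_manager.to_backend(XnnpackPartitioner()).to_executorch()
+
+# Bundle the pre- and post-lowering graphs so profiling events can be traced
+# back to the original PyTorch source.
+generate_etrecord("etrecord.bin", edge_manager_copy, et_program)
+```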
 
@@ -709,7 +705,7 @@ Run the export script and the ETRecord will be generated as `etrecord.bin`.
 
 ##### ETDump generation
 
-An ETDump is an artifact generated at runtime containing a trace of the model execution. For more information, see [https://pytorch.org/executorch/main/sdk-etdump.html](https://pytorch.org/executorch/main/sdk-etdump.html)
+An ETDump is an artifact generated at runtime containing a trace of the model execution. For more information, see [the ETDump docs](../sdk-etdump.md).
 
 Include the ETDump header in your code.
 ```cpp
@@ -779,7 +775,7 @@ This prints the performance data in a tabular format in “inspector_out.txt”,
 ![](../_static/img/llm_manual_print_data_tabular.png)
 View in full size
 
-To learn more about the Inspector and the rich functionality it provides, see the [Inspector API Reference](https://pytorch.org/executorch/main/sdk-inspector.html).
+To learn more about the Inspector and the rich functionality it provides, see the [Inspector API Reference](../sdk-inspector.md).
 
 ## Custom Kernels
-With the ExecuTorch custom operator APIs, custom operator and kernel authors can easily bring in their kernel into PyTorch/ExecuTorch.
+With the ExecuTorch custom operator APIs, custom operator and kernel authors can easily bring their kernels into PyTorch/ExecuTorch.
@@ -857,7 +853,7 @@ torch.ops.load_library("libcustom_linear.so")
 
 Once loaded, you can use the custom operator in PyTorch code.
 
 For more information, see [PyTorch Custom Operators](https://pytorch.org/tutorials/advanced/torch_script_custom_ops.html) and
-and [ExecuTorch Kernel Registration](https://pytorch.org/executorch/stable/kernel-library-custom-aten-kernel.html).
+[ExecuTorch Kernel Registration](../kernel-library-custom-aten-kernel.md).
 
 ### Using a Custom Operator in a Model
 
@@ -879,9 +875,8 @@ def replace_linear_with_custom_linear(module):
 
 The remaining steps are the same as the normal flow. Now you can run this module in eager mode as well as export to ExecuTorch.
 
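+
+As a minimal sketch of those last steps (assuming the nanoGPT `GPT` model and the custom operator library from the earlier steps; the input shape here is illustrative):
+
+```python
+import torch
+
+from model import GPT  # nanoGPT model definition downloaded earlier
+
+torch.ops.load_library("libcustom_linear.so")  # register the custom op first
+
+model = GPT.from_pretrained("gpt2")
+replace_linear_with_custom_linear(model)
+
+# The module still runs in eager mode, now dispatching to the custom op.
+example_inputs = (torch.randint(0, 100, (1, 8), dtype=torch.long),)
+model(*example_inputs)
+
+# And it exports to ExecuTorch exactly as in Step 1.
+exported_program = torch.export.export(model, example_inputs)
+```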
-## How to build Mobile Apps
-You can execute an LLM using ExecuTorch on iOS and Android.
+## How to Build Mobile Apps
+See the instructions for building and running LLMs using ExecuTorch on iOS and Android:
 
-**For iOS see the [iLLaMA App](https://pytorch.org/executorch/main/llm/llama-demo-ios.html).**
-
-**For Android, see the [Android Sample App](https://pytorch.org/executorch/main/llm/llama-demo-android.html).**
+* **[iOS ExecuTorch LLaMA Demo App](llama-demo-ios.md)**
+* **[Android ExecuTorch LLaMA Demo App](llama-demo-android.md)**