Commit 861abb1

pytorchbot and dbort authored
Use relative links in llm/getting-started.md (#3244) (#3310)
Summary: Use relative markdown links instead of full URLs. This way, the docs will always point to a consistent branch.

Pull Request resolved: #3244

Test Plan: Clicked on all modified links in the rendered docs preview: https://docs-preview.pytorch.org/pytorch/executorch/3244/llm/getting-started.html

Reviewed By: Gasoonjia

Differential Revision: D56479234

Pulled By: dbort

fbshipit-source-id: 45fb25f017c73df8606c3fb861acafbdd82fec8c

(cherry picked from commit b560864)

Co-authored-by: Dave Bort <[email protected]>
1 parent 2a1ae4f commit 861abb1


docs/source/llm/getting-started.md

Lines changed: 18 additions & 23 deletions
@@ -90,7 +90,7 @@ cd ../..
 :::
 ::::
 
-For more information, see [Setting Up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup.html).
+For more information, see [Setting Up ExecuTorch](../getting-started-setup.md).
 
 
 ## Running a Large Language Model Locally
@@ -185,7 +185,7 @@ with open("nanogpt.pte", "wb") as file:
 
 To export, run the script with `python export_nanogpt.py` (or python3, as appropriate for your environment). It will generate a `nanogpt.pte` file in the current directory.
 
-For more information, see [Exporting to ExecuTorch](https://pytorch.org/executorch/main/tutorials/export-to-executorch-tutorial.html) and
+For more information, see [Exporting to ExecuTorch](../tutorials/export-to-executorch-tutorial) and
 [torch.export](https://pytorch.org/docs/stable/export.html).
 
 ### Step 2. Invoking the Runtime
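For context, the export flow that `export_nanogpt.py` walks through has roughly this shape. This is a minimal sketch using a stand-in module rather than nanoGPT; the `to_edge` and `.pte` serialization steps follow the export tutorial linked in the hunk above.

```python
# Minimal sketch of the export flow: eager module -> torch.export ->
# Edge dialect -> serialized ExecuTorch program (.pte file).
# TinyModel is a stand-in; the real script exports nanoGPT.
import torch
from executorch.exir import to_edge

class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 2)

    def forward(self, x):
        return self.linear(x)

model = TinyModel().eval()
example_inputs = (torch.randn(1, 8),)

exported_program = torch.export.export(model, example_inputs)
edge_program = to_edge(exported_program)
et_program = edge_program.to_executorch()

with open("model.pte", "wb") as f:
    f.write(et_program.buffer)
```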
@@ -344,8 +344,8 @@ curl -O https://raw.githubusercontent.com/pytorch/executorch/main/examples/llm_m
 curl -O https://raw.githubusercontent.com/pytorch/executorch/main/examples/llm_manual/managed_tensor.h
 ```
 
-To learn more, see [Running an ExecuTorch Model in C++](https://pytorch.org/executorch/main/running-a-model-cpp-tutorial.html)
-and the [ExecuTorch Runtime API Reference](https://pytorch.org/executorch/main/executorch-runtime-api-reference.html).
+To learn more, see [Running an ExecuTorch Model in C++](../running-a-model-cpp-tutorial.md)
+and the [ExecuTorch Runtime API Reference](../executorch-runtime-api-reference.md).
 
 ### Building and Running
 
@@ -541,11 +541,9 @@ target_link_libraries(
 xnnpack_backend) # Provides the XNNPACK CPU acceleration backend
 ```
 
-Keep the rest of the code the same. For more details refer to
-[Exporting to ExecuTorch](https://pytorch.org/executorch/main/llm/getting-started.html#step-1-exporting-to-executorch)
-and
-[Invoking the Runtime](https://pytorch.org/executorch/main/llm/getting-started.html#step-2-invoking-the-runtime)
-for more details
+Keep the rest of the code the same. For more details, refer to
+[Exporting to ExecuTorch](#step-1-exporting-to-executorch) and
+[Invoking the Runtime](#step-2-invoking-the-runtime).
 
 At this point, the working directory should contain the following files:
 
@@ -592,10 +590,8 @@ I'm not sure if you've heard of the "Curse of the Dragon" or not, but it's a ver
 The delegated model should be noticeably faster compared to the non-delegated model.
 
 For more information regarding backend delegation, see the ExecuTorch guides
-for the
-[XNNPACK Backend](https://pytorch.org/executorch/stable/tutorial-xnnpack-delegate-lowering.html)
-and
-[CoreML Backend](https://pytorch.org/executorch/stable/build-run-coreml.html).
+for the [XNNPACK Backend](../tutorial-xnnpack-delegate-lowering.md) and [CoreML
+Backend](../build-run-coreml.md).
 
 ## Quantization
 
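The backend delegation covered in this hunk boils down to a small change at export time: lower the Edge program to XNNPACK before converting it to an ExecuTorch program. A sketch, assuming the `XnnpackPartitioner` import path used by ExecuTorch examples of this vintage and continuing from the export sketch above:

```python
# Sketch: partition and lower supported subgraphs to the XNNPACK backend.
# The import path may differ across ExecuTorch releases.
from executorch.backends.xnnpack.partition.xnnpack_partitioner import (
    XnnpackPartitioner,
)

edge_program = edge_program.to_backend(XnnpackPartitioner())  # delegate subgraphs
et_program = edge_program.to_executorch()  # then serialize as before
```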
@@ -681,7 +677,7 @@ target_link_libraries(
 xnnpack_backend) # Provides the XNNPACK CPU acceleration backend
 ```
 
-For more information, see [Quantization in ExecuTorch](https://pytorch.org/executorch/stable/quantization-overview.html).
+For more information, see [Quantization in ExecuTorch](../quantization-overview.md).
 
 ## Profiling and Debugging
 After lowering a model by calling `to_backend()`, you may want to see what got delegated and what didn’t. ExecuTorch
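For the Quantization link updated in the hunk above, the PT2E flow the guide documents looks roughly like the following sketch, using the pre-autograd capture API and XNNPACK quantizer available around this release; exact import paths may vary. It continues the export sketch above, where `model` and `example_inputs` are defined.

```python
# Sketch of PT2E quantization before export.
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

m = capture_pre_autograd_graph(model, example_inputs)
quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
m = prepare_pt2e(m, quantizer)   # insert observers
m(*example_inputs)               # calibrate with representative inputs
m = convert_pt2e(m)              # produce the quantized graph
# Then export m with torch.export / to_edge / to_executorch as before.
```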
@@ -759,7 +755,7 @@ Through the ExecuTorch SDK, users are able to profile model execution, giving ti
 
 ##### ETRecord generation (Optional)
 
-An ETRecord is an artifact generated at the time of export that contains model graphs and source-level metadata linking the ExecuTorch program to the original PyTorch model. You can view all profiling events without an ETRecord, though with an ETRecord, you will also be able to link each event to the types of operators being executed, module hierarchy, and stack traces of the original PyTorch source code. For more information, see [https://pytorch.org/executorch/main/sdk-etrecord.html](https://pytorch.org/executorch/main/sdk-etrecord.html)
+An ETRecord is an artifact generated at the time of export that contains model graphs and source-level metadata linking the ExecuTorch program to the original PyTorch model. You can view all profiling events without an ETRecord, though with an ETRecord, you will also be able to link each event to the types of operators being executed, module hierarchy, and stack traces of the original PyTorch source code. For more information, see [the ETRecord docs](../sdk-etrecord.md).
 
 
 In your export script, after calling `to_edge()` and `to_executorch()`, call `generate_etrecord()` with the `EdgeProgramManager` from `to_edge()` and the `ExecuTorchProgramManager` from `to_executorch()`. Make sure to copy the `EdgeProgramManager`, as the call to `to_backend()` mutates the graph in-place.
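In code, that paragraph amounts to something like this sketch, assuming the `generate_etrecord` helper exposed by the ExecuTorch SDK at this point in time and the delegation sketch above:

```python
# Sketch: generate an ETRecord alongside the exported program. The Edge
# program is deep-copied first because to_backend() mutates it in place.
import copy
from executorch.sdk import generate_etrecord

edge_program = to_edge(exported_program)
edge_copy = copy.deepcopy(edge_program)          # keep a pristine copy
edge_program = edge_program.to_backend(XnnpackPartitioner())
et_program = edge_program.to_executorch()

generate_etrecord("etrecord.bin", edge_copy, et_program)
```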
@@ -781,7 +777,7 @@ Run the export script and the ETRecord will be generated as `etrecord.bin`.
 
 ##### ETDump generation
 
-An ETDump is an artifact generated at runtime containing a trace of the model execution. For more information, see [https://pytorch.org/executorch/main/sdk-etdump.html](https://pytorch.org/executorch/main/sdk-etdump.html)
+An ETDump is an artifact generated at runtime containing a trace of the model execution. For more information, see [the ETDump docs](../sdk-etdump.md).
 
 Include the ETDump header in your code.
 ```cpp
@@ -851,7 +847,7 @@ This prints the performance data in a tabular format in “inspector_out.txt”,
 ![](../_static/img/llm_manual_print_data_tabular.png)
 <a href="../_static/img/llm_manual_print_data_tabular.png" target="_blank">View in full size</a>
 
-To learn more about the Inspector and the rich functionality it provides, see the [Inspector API Reference](https://pytorch.org/executorch/main/sdk-inspector.html).
+To learn more about the Inspector and the rich functionality it provides, see the [Inspector API Reference](../sdk-inspector.md).
 
 ## Custom Kernels
 With the ExecuTorch custom operator APIs, custom operator and kernel authors can easily bring in their kernel into PyTorch/ExecuTorch.
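For the Inspector link updated above, typical usage per the SDK docs of this era is just a few lines; the file names are illustrative:

```python
# Sketch: load the runtime ETDump (plus the optional export-time ETRecord)
# and print the profiling table shown in the screenshot above.
from executorch.sdk import Inspector

inspector = Inspector(etdump_path="etdump.etdp", etrecord="etrecord.bin")
inspector.print_data_tabular()
```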
@@ -929,7 +925,7 @@ torch.ops.load_library("libcustom_linear.so")
 Once loaded, you can use the custom operator in PyTorch code.
 
 For more information, see [PyTorch Custom Operators](https://pytorch.org/tutorials/advanced/torch_script_custom_ops.html) and
-and [ExecuTorch Kernel Registration](https://pytorch.org/executorch/stable/kernel-library-custom-aten-kernel.html).
+[ExecuTorch Kernel Registration](../kernel-library-custom-aten-kernel.md).
 
 ### Using a Custom Operator in a Model
 
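Once the shared library is loaded, calling the operator from Python is direct. A sketch; the `myops::custom_linear` namespace and signature are illustrative assumptions, not the guide's actual registration:

```python
# Sketch: load a compiled custom kernel library and invoke the op.
# The namespace and operator name below are hypothetical.
import torch

torch.ops.load_library("libcustom_linear.so")

x = torch.randn(2, 4)
weight = torch.randn(3, 4)
bias = torch.randn(3)
out = torch.ops.myops.custom_linear(x, weight, bias)  # hypothetical op
```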
@@ -951,9 +947,8 @@ def replace_linear_with_custom_linear(module):
 
 The remaining steps are the same as the normal flow. Now you can run this module in eager mode as well as export to ExecuTorch.
 
-## How to build Mobile Apps
-You can execute an LLM using ExecuTorch on iOS and Android.
+## How to Build Mobile Apps
+See the instructions for building and running LLMs using ExecuTorch on iOS and Android.
 
-**For iOS see the [iLLaMA App](https://pytorch.org/executorch/main/llm/llama-demo-ios.html).**
-
-**For Android, see the [Android Sample App](https://pytorch.org/executorch/main/llm/llama-demo-android.html).**
+* **[iOS ExecuTorch LLaMA Demo App](llama-demo-ios.md)**
+* **[Android ExecuTorch LLaMA Demo App](llama-demo-android.md)**
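The `replace_linear_with_custom_linear` helper named in the hunk header above plausibly has the following shape; `CustomLinear` is a hypothetical wrapper module that dispatches to the custom op from the earlier sketch:

```python
# Sketch: recursively swap nn.Linear children for a custom-op-backed module.
import torch

class CustomLinear(torch.nn.Module):  # hypothetical wrapper
    def __init__(self, weight, bias):
        super().__init__()
        self.weight = weight
        self.bias = bias

    def forward(self, x):
        # Dispatch to the (hypothetical) custom operator loaded earlier.
        return torch.ops.myops.custom_linear(x, self.weight, self.bias)

def replace_linear_with_custom_linear(module):
    for name, child in module.named_children():
        if isinstance(child, torch.nn.Linear):
            setattr(module, name, CustomLinear(child.weight, child.bias))
        else:
            replace_linear_with_custom_linear(child)
```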
