Commit c6761f4

Merge updated documentation into master (#8638)
* New Getting Started documentation (#8179) — WIP new getting started
* Update documentation flow and add placeholders (#8287) — add placeholder top-level doc pages
* Add new export + lowering docs, update getting started (#8412) — write new top-level export and lowering documentation
* More doc placeholders (#8523)
  * Move cmake and faq docs to new location
  * Rename CMake build to Building from Source
* Move backend docs to new locations (#8413)
  * Temporarily remove new backend pages
  * Move backend docs to new locations
  * Update backend titles and inline contents
* Backend doc template (#8524) — add backend template, update XNNPACK docs
* Add runtime integration documentation (#8516) — add runtime integration doc
* Move iOS docs to top, add Android placeholders (#8511)
  * Temporarily remove using-executorch-ios.md
  * Move Apple runtime docs to new location
* Clean up documentation placeholders and links, add top-level docs for C++ APIs, Android, and troubleshooting (#8618)
  * Clean up getting-started.md, remove placeholders
  * Move Android pre-built AAR info into top-level Android page
  * Add placeholder backend overview
  * Add placeholder troubleshooting docs
  * Populate top-level C++ API doc
  * Clean up additional doc placeholders and fix broken links
  * Add env setup instructions for source build
* Fix getting started code snippet (#8637) — fix quotes in getting started code snippets
* Clean up a few more doc sections and links (#8672) — clean up a few more broken links and sections in new doc flow
* Fix QNN link, typo (#8729)
* Add a CMake snippet to the XNNPACK backend doc build section (#8730) — add CMake example to xnnpack backend doc
1 parent 84273f4 commit c6761f4

39 files changed: +1,313 −137 lines

backends/vulkan/README.md

+1 −1

@@ -1,4 +1,4 @@
-# ExecuTorch Vulkan Delegate
+# Vulkan Backend
 
 The ExecuTorch Vulkan delegate is a native GPU delegate for ExecuTorch that is
 built on top of the cross-platform Vulkan GPU API standard. It is primarily

docs/source/api-life-cycle.md

+1 −1

@@ -1,4 +1,4 @@
-# ExecuTorch API Life Cycle and Deprecation Policy
+# API Life Cycle and Deprecation Policy
 
 ## API Life Cycle

docs/source/native-delegates-executorch-xnnpack-delegate.md → docs/source/backend-delegates-xnnpack-reference.md (renamed)

+1 −1

@@ -1,4 +1,4 @@
-# ExecuTorch XNNPACK delegate
+# XNNPACK Delegate Internals
 
 This is a high-level overview of the ExecuTorch XNNPACK backend delegate. This high performance delegate is aimed to reduce CPU inference latency for ExecuTorch models. We will provide a brief introduction to the XNNPACK library and explore the delegate’s overall architecture and intended use cases.

docs/source/backend-template.md

+15 (new file)

# Backend Template

## Features

## Target Requirements

## Development Requirements

## Lowering a Model to *Backend Name*

### Partitioner API

### Quantization

## Runtime Integration

docs/source/executorch-arm-delegate-tutorial.md → docs/source/backends-arm-ethos-u.md (renamed)

+4 −4

@@ -1,14 +1,14 @@
 <!---- Name is a WIP - this reflects better what it can do today ----->
-# Building and Running ExecuTorch with ARM Ethos-U Backend
+# ARM Ethos-U Backend
 
 <!----This will show a grid card on the page----->
 ::::{grid} 2
 
 :::{grid-item-card} Tutorials we recommend you complete before this:
 :class-card: card-prerequisites
 * [Introduction to ExecuTorch](./intro-how-it-works.md)
-* [Setting up ExecuTorch](./getting-started-setup.md)
-* [Building ExecuTorch with CMake](./runtime-build-and-cross-compilation.md)
+* [Getting Started](./getting-started.md)
+* [Building ExecuTorch with CMake](./using-executorch-building-from-source.md)
 :::
 
 :::{grid-item-card} What you will learn in this tutorial:

@@ -286,7 +286,7 @@ The `generate_pte_file` function in `run.sh` script produces the `.pte` files ba
 
 ExecuTorch's CMake build system produces a set of build pieces which are critical for us to include and run the ExecuTorch runtime with-in the bare-metal environment we have for Corstone FVPs from Ethos-U SDK.
 
-[This](./runtime-build-and-cross-compilation.md) document provides a detailed overview of each individual build piece. For running either variant of the `.pte` file, we will need a core set of libraries. Here is a list,
+[This](./using-executorch-building-from-source.md) document provides a detailed overview of each individual build piece. For running either variant of the `.pte` file, we will need a core set of libraries. Here is a list,
 
 - `libexecutorch.a`
 - `libportable_kernels.a`

docs/source/build-run-xtensa.md → docs/source/backends-cadence.md (renamed)

+4 −4

@@ -1,4 +1,4 @@
-# Building and Running ExecuTorch on Xtensa HiFi4 DSP
+# Cadence Xtensa Backend
 
 
 In this tutorial we will walk you through the process of getting setup to build ExecuTorch for an Xtensa HiFi4 DSP and running a simple model on it.

@@ -17,9 +17,9 @@ On top of being able to run on the Xtensa HiFi4 DSP, another goal of this tutori
 :::
 :::{grid-item-card} Tutorials we recommend you complete before this:
 :class-card: card-prerequisites
-* [Introduction to ExecuTorch](intro-how-it-works.md)
-* [Setting up ExecuTorch](getting-started-setup.md)
-* [Building ExecuTorch with CMake](runtime-build-and-cross-compilation.md)
+* [Introduction to ExecuTorch](./intro-how-it-works.md)
+* [Getting Started](./getting-started.md)
+* [Building ExecuTorch with CMake](./using-executorch-building-from-source.md)
 :::
 ::::

docs/source/build-run-coreml.md → docs/source/backends-coreml.md (renamed)

+4 −4

@@ -1,4 +1,4 @@
-# Building and Running ExecuTorch with Core ML Backend
+# Core ML Backend
 
 Core ML delegate uses Core ML APIs to enable running neural networks via Apple's hardware acceleration. For more about Core ML you can read [here](https://developer.apple.com/documentation/coreml). In this tutorial, we will walk through the steps of lowering a PyTorch model to Core ML delegate

@@ -11,9 +11,9 @@ Core ML delegate uses Core ML APIs to enable running neural networks via Apple's
 :::
 :::{grid-item-card} Tutorials we recommend you complete before this:
 :class-card: card-prerequisites
-* [Introduction to ExecuTorch](intro-how-it-works.md)
-* [Setting up ExecuTorch](getting-started-setup.md)
-* [Building ExecuTorch with CMake](runtime-build-and-cross-compilation.md)
+* [Introduction to ExecuTorch](./intro-how-it-works.md)
+* [Getting Started](./getting-started.md)
+* [Building ExecuTorch with CMake](./using-executorch-building-from-source.md)
 * [ExecuTorch iOS Demo App](demo-apps-ios.md)
 :::
 ::::

docs/source/build-run-mediatek-backend.md → docs/source/backends-mediatek.md (renamed)

+6 −6

@@ -1,4 +1,4 @@
-# Building and Running ExecuTorch with MediaTek Backend
+# MediaTek Backend
 
 MediaTek backend empowers ExecuTorch to speed up PyTorch models on edge devices that equips with MediaTek Neuron Processing Unit (NPU). This document offers a step-by-step guide to set up the build environment for the MediaTek ExecuTorch libraries.

@@ -11,9 +11,9 @@ MediaTek backend empowers ExecuTorch to speed up PyTorch models on edge devices
 :::
 :::{grid-item-card} Tutorials we recommend you complete before this:
 :class-card: card-prerequisites
-* [Introduction to ExecuTorch](intro-how-it-works.md)
-* [Setting up ExecuTorch](getting-started-setup.md)
-* [Building ExecuTorch with CMake](runtime-build-and-cross-compilation.md)
+* [Introduction to ExecuTorch](./intro-how-it-works.md)
+* [Getting Started](./getting-started.md)
+* [Building ExecuTorch with CMake](./using-executorch-building-from-source.md)
 :::
 ::::

@@ -34,7 +34,7 @@ MediaTek backend empowers ExecuTorch to speed up PyTorch models on edge devices
 
 Follow the steps below to setup your build environment:
 
-1. **Setup ExecuTorch Environment**: Refer to the [Setting up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup) guide for detailed instructions on setting up the ExecuTorch environment.
+1. **Setup ExecuTorch Environment**: Refer to the [Getting Started](getting-started.md) guide for detailed instructions on setting up the ExecuTorch environment.
 
 2. **Setup MediaTek Backend Environment**
 - Install the dependent libs. Ensure that you are inside `backends/mediatek/` directory

@@ -91,4 +91,4 @@ cd executorch
 
 ```bash
 export LD_LIBRARY_PATH=<path_to_usdk>:<path_to_neuron_backend>:$LD_LIBRARY_PATH
-```
+```

docs/source/backends-mps.md

+157 (new file)

# MPS Backend

In this tutorial we will walk you through the process of getting set up to build the MPS backend for ExecuTorch and running a simple model on it.

The MPS backend maps machine learning computational graphs and primitives onto the [MPS Graph](https://developer.apple.com/documentation/metalperformanceshadersgraph/mpsgraph?language=objc) framework and tuned kernels provided by [MPS](https://developer.apple.com/documentation/metalperformanceshaders?language=objc).

::::{grid} 2
:::{grid-item-card} What you will learn in this tutorial:
:class-card: card-prerequisites
* In this tutorial you will learn how to export the [MobileNet V3](https://pytorch.org/vision/main/models/mobilenetv3.html) model to the MPS delegate.
* You will also learn how to compile and deploy the ExecuTorch runtime with the MPS delegate on macOS and iOS.
:::
:::{grid-item-card} Tutorials we recommend you complete before this:
:class-card: card-prerequisites
* [Introduction to ExecuTorch](./intro-how-it-works.md)
* [Getting Started](./getting-started.md)
* [Building ExecuTorch with CMake](./using-executorch-building-from-source.md)
* [ExecuTorch iOS Demo App](demo-apps-ios.md)
* [ExecuTorch iOS LLaMA Demo App](llm/llama-demo-ios.md)
:::
::::

## Prerequisites (Hardware and Software)

In order to be able to successfully build and run a model using the MPS backend for ExecuTorch, you'll need the following hardware and software components:

### Hardware
- A [Mac](https://www.apple.com/mac/) for tracing the model

### Software
- **Ahead-of-time** tracing:
  - [macOS](https://www.apple.com/macos/) 12
- **Runtime**:
  - [macOS](https://www.apple.com/macos/) >= 12.4
  - [iOS](https://www.apple.com/ios) >= 15.4
  - [Xcode](https://developer.apple.com/xcode/) >= 14.1

## Setting up Developer Environment

***Step 1.*** Complete the [Getting Started](getting-started.md) tutorial.

***Step 2.*** Install the dependencies needed to lower to the MPS delegate:

```bash
./backends/apple/mps/install_requirements.sh
```

## Build

### AOT (Ahead-of-time) Components

**Compiling model for MPS delegate**:
- In this step, you will generate a simple ExecuTorch program that lowers the MobileNetV3 model to the MPS delegate. You'll then pass this Program (the `.pte` file) to the runtime to run it using the MPS backend.

```bash
cd executorch
# Note: the `mps_example` script uses the MPSPartitioner by default for ops that are not yet supported by the MPS delegate. To turn it off, pass `--no-use_partitioner`.
python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --bundled --use_fp16

# To see all options, run the following command:
python3 -m examples.apple.mps.scripts.mps_example --help
```

### Runtime

**Building the MPS executor runner:**
```bash
# In this step, you'll build the `mps_executor_runner`, which is able to run MPS lowered modules:
cd executorch
./examples/apple/mps/scripts/build_mps_executor_runner.sh
```

## Run the mv3 generated model using the mps_executor_runner

```bash
./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_bundled_fp16.pte --bundled_program
```

- You should see the following results. Note that no output file will be generated in this example:
```
I 00:00:00.003290 executorch:mps_executor_runner.mm:286] Model file mv3_mps_bundled_fp16.pte is loaded.
I 00:00:00.003306 executorch:mps_executor_runner.mm:292] Program methods: 1
I 00:00:00.003308 executorch:mps_executor_runner.mm:294] Running method forward
I 00:00:00.003311 executorch:mps_executor_runner.mm:349] Setting up non-const buffer 1, size 606112.
I 00:00:00.003374 executorch:mps_executor_runner.mm:376] Setting up memory manager
I 00:00:00.003376 executorch:mps_executor_runner.mm:392] Loading method name from plan
I 00:00:00.018942 executorch:mps_executor_runner.mm:399] Method loaded.
I 00:00:00.018944 executorch:mps_executor_runner.mm:404] Loading bundled program...
I 00:00:00.018980 executorch:mps_executor_runner.mm:421] Inputs prepared.
I 00:00:00.118731 executorch:mps_executor_runner.mm:438] Model executed successfully.
I 00:00:00.122615 executorch:mps_executor_runner.mm:501] Model verified successfully.
```

### [Optional] Run the generated model directly using pybind
1. Make sure `pybind` MPS support was installed:
```bash
./install_executorch.sh --pybind mps
```
2. Run the `mps_example` script to trace the model and run it directly from Python:
```bash
cd executorch
# Check correctness between the PyTorch eager forward pass and the ExecuTorch MPS delegate forward pass
python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp16 --check_correctness
# You should see the following output: `Results between ExecuTorch forward pass with MPS backend and PyTorch forward pass for mv3_mps are matching!`

# Check performance between the PyTorch MPS forward pass and the ExecuTorch MPS forward pass
python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp16 --bench_pytorch
```

### Profiling
1. [Optional] Generate an [ETRecord](./etrecord.rst) while you're exporting your model.
```bash
cd executorch
python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --generate_etrecord -b
```
2. Run your Program on the ExecuTorch runtime and generate an [ETDump](./etdump.md).
```
./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_bundled_fp16.pte --bundled_program --dump-outputs
```
3. Create an instance of the Inspector API by passing in the ETDump you have sourced from the runtime along with the optionally generated ETRecord from step 1.
```bash
python3 -m sdk.inspector.inspector_cli --etdump_path etdump.etdp --etrecord_path etrecord.bin
```

## Deploying and Running on Device

***Step 1***. Create the ExecuTorch core and MPS delegate frameworks to link on iOS:
```bash
cd executorch
./build/build_apple_frameworks.sh --mps
```

`mps_delegate.xcframework` will be in the `cmake-out` folder, along with `executorch.xcframework` and `portable_delegate.xcframework`:
```bash
cd cmake-out && ls
```

***Step 2***. Link the frameworks into your Xcode project:
Go to the project Target's `Build Phases` - `Link Binaries With Libraries`, click the **+** sign, and add the frameworks located in the `Release` folder:
- `executorch.xcframework`
- `portable_delegate.xcframework`
- `mps_delegate.xcframework`

From the same page, include the needed libraries for the MPS delegate:
- `MetalPerformanceShaders.framework`
- `MetalPerformanceShadersGraph.framework`
- `Metal.framework`

In this tutorial, you have learned how to lower a model to the MPS delegate, build the `mps_executor_runner` to run a lowered model through the MPS delegate, and run a lowered model directly on device using the MPS delegate static library.

## Frequently Encountered Errors and Resolution

If you encounter any bugs or issues while following this tutorial, please file a bug/issue on the [ExecuTorch repository](https://github.com/pytorch/executorch/issues) with hashtag **#mps**.
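The `--use_fp16` flag above stores tensors in half precision, halving storage at some cost in accuracy. As a quick illustration of that tradeoff (plain Python, no ExecuTorch dependency — just the IEEE 754 encodings):

```python
import struct

weights = [0.1234567, -2.5, 3.14159, 0.00001]

# Pack the same values as little-endian fp32 ("<f") and fp16 ("<e").
fp32 = b"".join(struct.pack("<f", w) for w in weights)
fp16 = b"".join(struct.pack("<e", w) for w in weights)

print(len(fp32), len(fp16))  # 16 bytes vs 8 bytes for 4 weights

# Round-trip through fp16 to see the precision loss per value.
roundtrip = [struct.unpack("<e", fp16[i:i + 2])[0] for i in range(0, len(fp16), 2)]
for orig, rt in zip(weights, roundtrip):
    print(f"{orig:+.7f} -> {rt:+.7f}")
```

Values like `-2.5` survive exactly, while others pick up rounding error — which is why the correctness-check invocations above use `--no-use_fp16`.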

docs/source/backends-overview.md

+20 (new file)

# Backend Overview

ExecuTorch backends provide hardware acceleration for a specific hardware target. In order to achieve maximum performance on target hardware, ExecuTorch optimizes the model for a specific backend during the export and lowering process. This means that the resulting .pte file is specialized for that hardware. To deploy to multiple backends, such as Core ML on iOS and Arm CPU on Android, it is common to generate a dedicated .pte file for each.

The choice of hardware backend is informed by the hardware that the model is intended to be deployed on. Each backend has specific hardware requirements and levels of model support. See the documentation for each hardware backend for more details.

As part of the .pte file creation process, ExecuTorch identifies portions of the model (partitions) that are supported by the given backend. These sections are processed by the backend ahead of time to support efficient execution. Portions of the model that are not supported by the delegate, if any, are executed using the portable fallback implementation on CPU. This allows for partial model acceleration when not all model operators are supported on the backend, but may have negative performance implications. In addition, multiple partitioners can be specified in order of priority; this allows, for example, operators not supported on GPU to run on CPU via XNNPACK.

## Available Backends

Commonly used hardware backends are listed below. For mobile, consider using XNNPACK for Android and XNNPACK or Core ML for iOS. To create a .pte file for a specific backend, pass the appropriate partitioner class to `to_edge_transform_and_lower`. See the appropriate backend documentation for more information.

- [XNNPACK (Mobile CPU)](backends-xnnpack.md)
- [Core ML (iOS)](backends-coreml.md)
- [Metal Performance Shaders (iOS GPU)](backends-mps.md)
- [Vulkan (Android GPU)](backends-vulkan.md)
- [Qualcomm NPU](backends-qualcomm.md)
- [MediaTek NPU](backends-mediatek.md)
- [Arm Ethos-U NPU](backends-arm-ethos-u.md)
- [Cadence DSP](backends-cadence.md)
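The priority-ordered partitioning described in this page can be sketched with a toy example. This is not the ExecuTorch partitioner API — the backend names and operator sets below are hypothetical — it only illustrates how each operator is assigned to the first backend in priority order that supports it, with unsupported operators falling back to the portable CPU implementation:

```python
# Toy sketch of priority-ordered partitioning (not the real ExecuTorch API).
# Each backend declares the operators it supports; hypothetical coverage sets.
BACKEND_SUPPORT = {
    "vulkan": {"conv2d", "relu", "add"},               # assumed GPU coverage
    "xnnpack": {"conv2d", "relu", "add", "softmax"},   # assumed broader CPU coverage
}

def assign_ops(ops, backend_priority):
    """Assign each op to the first backend (in priority order) that supports it."""
    assignment = {}
    for op in ops:
        for backend in backend_priority:
            if op in BACKEND_SUPPORT[backend]:
                assignment[op] = backend
                break
        else:
            # No delegate claimed the op: run it with the portable CPU kernels.
            assignment[op] = "portable_cpu"
    return assignment

model_ops = ["conv2d", "relu", "softmax", "topk"]
print(assign_ops(model_ops, ["vulkan", "xnnpack"]))
# conv2d and relu go to vulkan, softmax falls through to xnnpack,
# and topk (supported by neither) falls back to portable_cpu.
```

A real partitioner operates on a traced graph and groups contiguous supported ops into delegated subgraphs, but the priority-then-fallback assignment rule is the same idea.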

docs/source/build-run-qualcomm-ai-engine-direct-backend.md → docs/source/backends-qualcomm.md (renamed)

+5 −5

@@ -1,4 +1,4 @@
-# Building and Running ExecuTorch with Qualcomm AI Engine Direct Backend
+# Qualcomm AI Engine Backend
 
 In this tutorial we will walk you through the process of getting started to
 build ExecuTorch for Qualcomm AI Engine Direct and running a model on it.

@@ -14,9 +14,9 @@ Qualcomm AI Engine Direct is also referred to as QNN in the source and documenta
 :::
 :::{grid-item-card} Tutorials we recommend you complete before this:
 :class-card: card-prerequisites
-* [Introduction to ExecuTorch](intro-how-it-works.md)
-* [Setting up ExecuTorch](getting-started-setup.md)
-* [Building ExecuTorch with CMake](runtime-build-and-cross-compilation.md)
+* [Introduction to ExecuTorch](./intro-how-it-works.md)
+* [Getting Started](./getting-started.md)
+* [Building ExecuTorch with CMake](./using-executorch-building-from-source.md)
 :::
 ::::

@@ -347,7 +347,7 @@ The model, inputs, and output location are passed to `qnn_executorch_runner` by
 ### Running a model via ExecuTorch's android demo-app
 
 An Android demo-app using Qualcomm AI Engine Direct Backend can be found in
-`examples`. Please refer to android demo app [tutorial](https://pytorch.org/executorch/stable/demo-apps-android.html).
+`examples`. Please refer to android demo app [tutorial](demo-apps-android.md).
 
 ## Supported model list