| 1 | +# MPS Backend |
| 2 | + |
| 3 | +This tutorial walks you through getting set up to build the MPS backend for ExecuTorch and running a simple model on it. |
| 4 | + |
| 5 | +The MPS backend device maps machine learning computational graphs and primitives onto the [MPS Graph](https://developer.apple.com/documentation/metalperformanceshadersgraph/mpsgraph?language=objc) framework and onto tuned kernels provided by [MPS](https://developer.apple.com/documentation/metalperformanceshaders?language=objc). |
| 6 | + |
| 7 | +::::{grid} 2 |
| 8 | +:::{grid-item-card} What you will learn in this tutorial: |
| 9 | +:class-card: card-prerequisites |
| 10 | +* In this tutorial you will learn how to export the [MobileNet V3](https://pytorch.org/vision/main/models/mobilenetv3.html) model to the MPS delegate. |
| 11 | +* You will also learn how to compile and deploy the ExecuTorch runtime with the MPS delegate on macOS and iOS. |
| 12 | +::: |
| 13 | +:::{grid-item-card} Tutorials we recommend you complete before this: |
| 14 | +:class-card: card-prerequisites |
| 15 | +* [Introduction to ExecuTorch](./intro-how-it-works.md) |
| 16 | +* [Getting Started](./getting-started.md) |
| 17 | +* [Building ExecuTorch with CMake](./using-executorch-building-from-source.md) |
| 18 | +* [ExecuTorch iOS Demo App](demo-apps-ios.md) |
| 19 | +* [ExecuTorch iOS LLaMA Demo App](llm/llama-demo-ios.md) |
| 20 | +::: |
| 21 | +:::: |
| 22 | + |
| 23 | + |
| 24 | +## Prerequisites (Hardware and Software) |
| 25 | + |
| 26 | +To build and run a model using the MPS backend for ExecuTorch, you'll need the following hardware and software components: |
| 27 | + |
| 28 | +### Hardware: |
| 29 | + - A [Mac](https://www.apple.com/mac/) for tracing the model |
| 30 | + |
| 31 | +### Software: |
| 32 | + |
| 33 | + - **Ahead of time** tracing: |
| 34 | + - [macOS](https://www.apple.com/macos/) 12 |
| 35 | + |
| 36 | + - **Runtime**: |
| 37 | + - [macOS](https://www.apple.com/macos/) >= 12.4 |
| 38 | + - [iOS](https://www.apple.com/ios) >= 15.4 |
| 39 | + - [Xcode](https://developer.apple.com/xcode/) >= 14.1 |
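
The runtime version floor listed above can be checked before building. The following is a minimal sketch, not part of ExecuTorch: the helper names `parse_version` and `meets_minimum` are hypothetical, and only the macOS 12.4 threshold is taken from the list above. It relies on the standard-library `platform.mac_ver()`, which returns an empty version string on non-macOS hosts.

```python
import platform

# Minimum macOS version for the runtime, taken from the requirements above.
MIN_MACOS_RUNTIME = (12, 4)


def parse_version(text):
    """Turn a dotted version string such as '13.6.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def meets_minimum(version_text, minimum=MIN_MACOS_RUNTIME):
    """Return True when version_text is at least `minimum` (hypothetical helper)."""
    version = parse_version(version_text)
    # Pad short versions so that '12' compares as (12, 0), not as a prefix of (12, 4).
    padded = version + (0,) * (len(minimum) - len(version))
    return padded >= minimum


if __name__ == "__main__":
    host = platform.mac_ver()[0]  # empty string when not running on macOS
    if not host:
        print("Not running on macOS; the MPS backend requires a Mac.")
    else:
        print(f"macOS {host}: runtime requirement met = {meets_minimum(host)}")
```

The same comparison could be reused for the iOS and Xcode minimums by passing a different `minimum` tuple.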
| 40 | + |
| 41 | +## Setting up Developer Environment |
| 42 | + |
| 43 | +***Step 1.*** Complete the [Getting Started](getting-started.md) tutorial. |
| 44 | + |
| 45 | +***Step 2.*** Install the dependencies needed to lower a model to the MPS delegate: |
| 46 | + |
| 47 | + ```bash |
| 48 | + ./backends/apple/mps/install_requirements.sh |
| 49 | + ``` |
| 50 | + |
| 51 | +## Build |
| 52 | + |
| 53 | +### AOT (Ahead-of-time) Components |
| 54 | + |
| 55 | +**Compiling model for MPS delegate**: |
| 56 | +- In this step, you will generate a simple ExecuTorch program that lowers the MobileNetV3 model to the MPS delegate. You'll then pass this program (the `.pte` file) to the runtime to run it using the MPS backend. |
| 57 | + |
| 58 | +```bash |
| 59 | +cd executorch |
| 60 | +# Note: by default, the `mps_example` script uses the MPSPartitioner to handle ops that are not yet supported by the MPS delegate. To turn it off, pass `--no-use_partitioner`. |
| 61 | +python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --bundled --use_fp16 |
| 62 | + |
| 63 | +# To see all options, run the following command: |
| 64 | +python3 -m examples.apple.mps.scripts.mps_example --help |
| 65 | +``` |
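
When the export is driven from a Python build script rather than typed by hand, the command above can be assembled programmatically. The sketch below is illustrative only: `mps_example_command` is a hypothetical helper, and it uses only flags that appear in this tutorial (`--model_name`, `--bundled`, `--use_fp16`, `--no-use_fp16`, `--no-use_partitioner`). The fp16 flag is always passed explicitly so the sketch makes no assumption about the script's default.

```python
import shlex


def mps_example_command(model_name, *, bundled=False, use_fp16=True, use_partitioner=True):
    """Build the argv list for the mps_example script (hypothetical helper)."""
    cmd = [
        "python3",
        "-m",
        "examples.apple.mps.scripts.mps_example",
        f"--model_name={model_name}",
    ]
    if bundled:
        cmd.append("--bundled")
    # Always state the precision explicitly instead of relying on a default.
    cmd.append("--use_fp16" if use_fp16 else "--no-use_fp16")
    if not use_partitioner:
        cmd.append("--no-use_partitioner")
    return cmd


# Show the command as a single shell-quoted string.
print(shlex.join(mps_example_command("mv3", bundled=True)))
```

The resulting list can be handed to `subprocess.run(cmd, check=True, cwd="executorch")` so that a failed export stops the surrounding script.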
| 66 | + |
| 67 | +### Runtime |
| 68 | + |
| 69 | +**Building the MPS executor runner:** |
| 70 | +```bash |
| 71 | +# In this step, you'll build the `mps_executor_runner`, which can run MPS-lowered modules: |
| 72 | +cd executorch |
| 73 | +./examples/apple/mps/scripts/build_mps_executor_runner.sh |
| 74 | +``` |
| 75 | + |
| 76 | +## Run the mv3 generated model using the mps_executor_runner |
| 77 | + |
| 78 | +```bash |
| 79 | +./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_bundled_fp16.pte --bundled_program |
| 80 | +``` |
| 81 | + |
| 82 | +- You should see the following results. Note that no output file will be generated in this example: |
| 83 | +``` |
| 84 | +I 00:00:00.003290 executorch:mps_executor_runner.mm:286] Model file mv3_mps_bundled_fp16.pte is loaded. |
| 85 | +I 00:00:00.003306 executorch:mps_executor_runner.mm:292] Program methods: 1 |
| 86 | +I 00:00:00.003308 executorch:mps_executor_runner.mm:294] Running method forward |
| 87 | +I 00:00:00.003311 executorch:mps_executor_runner.mm:349] Setting up non-const buffer 1, size 606112. |
| 88 | +I 00:00:00.003374 executorch:mps_executor_runner.mm:376] Setting up memory manager |
| 89 | +I 00:00:00.003376 executorch:mps_executor_runner.mm:392] Loading method name from plan |
| 90 | +I 00:00:00.018942 executorch:mps_executor_runner.mm:399] Method loaded. |
| 91 | +I 00:00:00.018944 executorch:mps_executor_runner.mm:404] Loading bundled program... |
| 92 | +I 00:00:00.018980 executorch:mps_executor_runner.mm:421] Inputs prepared. |
| 93 | +I 00:00:00.118731 executorch:mps_executor_runner.mm:438] Model executed successfully. |
| 94 | +I 00:00:00.122615 executorch:mps_executor_runner.mm:501] Model verified successfully. |
| 95 | +``` |
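
In an automated setup, the log above can be checked by a script instead of by eye. The following sketch parses log lines of the shape shown above and reports whether the run executed and verified, plus the time between "Inputs prepared." and "Model executed successfully.". The line format (a level letter, an `HH:MM:SS.ffffff` timestamp, then `executorch:file:line]`) is inferred from the sample output and is an assumption, not a documented contract; `parse_runner_log` and `summarize` are hypothetical names.

```python
import re

# Assumed line shape, inferred from the sample output:
#   I 00:00:00.118731 executorch:mps_executor_runner.mm:438] Model executed successfully.
LOG_LINE = re.compile(
    r"^[A-Z] (\d+):(\d+):(\d+\.\d+) executorch:(\S+?):(\d+)\] (.*)$"
)


def parse_runner_log(text):
    """Return a list of (seconds_since_start, message) for each recognized line."""
    events = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            hours, minutes, seconds, _file, _lineno, message = match.groups()
            stamp = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
            events.append((stamp, message))
    return events


def summarize(events):
    """Report success flags and the execution time, when both markers are present."""
    stamps = {message: stamp for stamp, message in events}
    start = stamps.get("Inputs prepared.")
    end = stamps.get("Model executed successfully.")
    return {
        "executed": end is not None,
        "verified": "Model verified successfully." in stamps,
        "execution_seconds": None if start is None or end is None else end - start,
    }
```

Applied to the sample output, the two relevant timestamps are 0.018980 s and 0.118731 s, so the reported execution time is roughly 0.1 s.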
| 96 | + |
| 97 | +### [Optional] Run the generated model directly using pybind |
| 98 | +1. Make sure `pybind` MPS support is installed: |
| 99 | +```bash |
| 100 | +./install_executorch.sh --pybind mps |
| 101 | +``` |
| 102 | +2. Run the `mps_example` script to trace the model and run it directly from Python: |
| 103 | +```bash |
| 104 | +cd executorch |
| 105 | +# Check correctness between PyTorch eager forward pass and ExecuTorch MPS delegate forward pass |
| 106 | +python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp16 --check_correctness |
| 107 | +# You should see the following output: `Results between ExecuTorch forward pass with MPS backend and PyTorch forward pass for mv3_mps are matching!` |
| 108 | + |
| 109 | +# Check performance between PyTorch MPS forward pass and ExecuTorch MPS forward pass |
| 110 | +python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp16 --bench_pytorch |
| 111 | +``` |
| 112 | + |
| 113 | +### Profiling: |
| 114 | +1. [Optional] Generate an [ETRecord](./etrecord.rst) while you're exporting your model. |
| 115 | +```bash |
| 116 | +cd executorch |
| 117 | +python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --generate_etrecord -b |
| 118 | +``` |
| 119 | +2. Run your Program on the ExecuTorch runtime and generate an [ETDump](./etdump.md). |
| 120 | +```bash |
| 121 | +./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_bundled_fp16.pte --bundled_program --dump-outputs |
| 122 | +``` |
| 123 | +3. Create an instance of the Inspector API by passing in the ETDump produced by the runtime, along with the ETRecord from step 1 if you generated one. |
| 124 | +```bash |
| 125 | +python3 -m sdk.inspector.inspector_cli --etdump_path etdump.etdp --etrecord_path etrecord.bin |
| 126 | +``` |
| 127 | + |
| 128 | +## Deploying and Running on Device |
| 129 | + |
| 130 | +***Step 1***. Create the ExecuTorch core and MPS delegate frameworks to link on iOS |
| 131 | +```bash |
| 132 | +cd executorch |
| 133 | +./build/build_apple_frameworks.sh --mps |
| 134 | +``` |
| 135 | + |
| 136 | +`mps_delegate.xcframework` will be in the `cmake-out` folder, along with `executorch.xcframework` and `portable_delegate.xcframework`: |
| 137 | +```bash |
| 138 | +cd cmake-out && ls |
| 139 | +``` |
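
Instead of reading the `ls` output by eye, a build script can confirm that the three frameworks named above were produced. This is a minimal sketch: `missing_frameworks` is a hypothetical helper, and it assumes the frameworks sit directly inside the folder you pass in (an `.xcframework` bundle is a directory, so the check uses `is_dir()`).

```python
from pathlib import Path

# Framework bundles named in this tutorial.
EXPECTED_FRAMEWORKS = (
    "executorch.xcframework",
    "portable_delegate.xcframework",
    "mps_delegate.xcframework",
)


def missing_frameworks(build_dir, expected=EXPECTED_FRAMEWORKS):
    """Return the expected framework bundles that are absent from build_dir."""
    root = Path(build_dir)
    # An .xcframework is a directory bundle, not a single file.
    return [name for name in expected if not (root / name).is_dir()]
```

Calling `missing_frameworks("cmake-out")` from the `executorch` folder returns an empty list when all three bundles are present.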
| 140 | + |
| 141 | +***Step 2***. Link the frameworks into your Xcode project: |
| 142 | +Go to the project target's `Build Phases` - `Link Binary With Libraries`, click the **+** sign, and add the following frameworks, located in the `Release` folder: |
| 143 | +- `executorch.xcframework` |
| 144 | +- `portable_delegate.xcframework` |
| 145 | +- `mps_delegate.xcframework` |
| 146 | + |
| 147 | +From the same page, include the needed libraries for the MPS delegate: |
| 148 | +- `MetalPerformanceShaders.framework` |
| 149 | +- `MetalPerformanceShadersGraph.framework` |
| 150 | +- `Metal.framework` |
| 151 | + |
| 152 | +In this tutorial, you have learned how to lower a model to the MPS delegate, build the `mps_executor_runner`, and run a lowered model either through the runner or directly on device using the MPS delegate static library. |
| 153 | + |
| 154 | + |
| 155 | +## Frequently encountered errors and resolution |
| 156 | + |
| 157 | +If you encounter any bugs or issues while following this tutorial, please file an issue on the [ExecuTorch repository](https://github.com/pytorch/executorch/issues) with the hashtag **#mps**. |