|
| 1 | +# ESIMD kernel execution emulation on host. |
| 2 | + |
| 3 | +## Introduction |
| 4 | + |
| 5 | +The ESIMD implementation provides a feature to execute ESIMD kernels on the host |
| 6 | +CPU without an actual Intel GPU device in the system: the ESIMD emulator. |
| 7 | +Its main purpose is to give users a convenient way to debug ESIMD code |
| 8 | +in their favorite debuggers. Performance is not a priority for now and will likely be quite |
| 9 | +low. Since the emulator models massively parallel GPU kernel execution on CPU |
| 10 | +hardware, the execution profile may differ from that of a real GPU, and this must be taken |
| 11 | +into account when debugging. Redirecting execution to the ESIMD emulator is as simple as |
| 12 | +setting an environment variable; no program recompilation is needed. When running a |
| 13 | +kernel via the emulator, the SYCL runtime sees the emulator as a normal GPU device, i.e. the |
| 14 | +`is_gpu()` test returns true for it. |
| 15 | + |
| 16 | +Due to the specifics of the ESIMD programming model, the usual SYCL host device can't execute |
| 17 | +ESIMD kernels. For example, execution needs supporting libraries to emulate various kinds |
| 18 | +of barriers and GPU execution threads. It would be impractical for the host part of a SYCL ESIMD |
| 19 | +app to include or link to all the necessary infrastructure components, as they are not needed |
| 20 | +in most cases, when there is no ESIMD code or no debugging is wanted. It would also be |
| 21 | +inconvenient or even impossible for users to recompile the app with some switch to |
| 22 | +execute the ESIMD part on the CPU. The environment variable plus a separate back-end solve |
| 23 | +both problems. |
| 24 | + |
| 25 | +The ESIMD emulator encompasses the following main components: |
| 26 | +1) The ESIMD emulator plugin which is a SYCL runtime back-end similar to OpenCL or |
| 27 | +Level Zero. |
| 28 | +2) Host implementations of low-level ESIMD intrinsics such as `__esimd_scatter_scaled`. |
| 29 | +3) The supporting infrastructure linked dynamically to the plugin - the `libCM` library. |
| 30 | + |
| 31 | +See the Limitation section below for the main ESIMD emulator limitations. |
| 32 | + |
| 33 | +## Requirements |
| 34 | + |
| 35 | +The ESIMD_EMULATOR backend uses the [CM_EMU |
| 36 | +library](https://github.com/intel/cm-cpu-emulation) to emulate a GPU |
| 37 | +using software multi-threading. The library can either be provided as |
| 38 | +a separate pre-installed library on the host machine or built as part of |
| 39 | +the open-source Intel DPC++ compiler. The required version of CM_EMU is |
| 40 | +[1.0.20](https://github.com/intel/cm-cpu-emulation/releases/tag/v2022-02-11) |
| 41 | +or later. To have the CM_EMU library as part of the Intel DPC++ |
| 42 | +compiler for the ESIMD_EMULATOR backend, the library needs to be built |
| 43 | +during ESIMD_EMULATOR plug-in software module generation. Details on |
| 44 | +building the CM_EMU library for ESIMD_EMULATOR, such as required packages, |
| 45 | +are described in [ESIMD CPU Emulation](https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md#build-dpc-toolchain-with-support-for-esimd-cpu-emulation). |
| 46 | + |
| 47 | +## Command line option / environment variable options |
| 48 | + |
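| 49 | +No ESIMD_EMULATOR-specific command line option or environment variable is |
| 50 | +required for building and running ESIMD kernels with the ESIMD_EMULATOR |
| 51 | +backend. The backend is selected at run time with the generic `SYCL_DEVICE_FILTER` variable, as shown in the next section. |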
| 52 | + |
| 53 | +## Running ESIMD code under emulation mode |
| 54 | + |
| 55 | +The compilation step for ESIMD kernels prepared for the ESIMD_EMULATOR backend |
| 56 | +is the same as for the OpenCL and Level Zero backends. The full runnable code |
| 57 | +sample used below can be found in the [GitHub |
| 58 | +repo](https://github.com/intel/llvm-test-suite/blob/intel/SYCL/ESIMD/vadd_usm.cpp). |
| 59 | + |
| 60 | +To compile using the open-source Intel DPC++ compiler: |
| 61 | +> `$ clang++ -fsycl vadd_usm.cpp` |
| 62 | +
|
| 63 | +To compile using the Intel(R) oneAPI Toolkit: |
| 64 | +> `$ dpcpp vadd_usm.cpp` |
| 65 | +
|
| 66 | +To run under emulation through the ESIMD_EMULATOR backend: |
| 67 | +> `$ SYCL_DEVICE_FILTER=ext_intel_esimd_emulator:gpu ./a.out` |
| 68 | +
|
| 69 | +## Running ESIMD examples from [ESIMD test suite](https://github.com/intel/llvm-test-suite/tree/intel/SYCL/ESIMD) on github with ESIMD_EMULATOR backend |
| 70 | + |
| 71 | +``` |
| 72 | +# Get sources |
| 73 | +git clone https://github.com/intel/llvm-test-suite |
| 74 | +cd llvm-test-suite |
| 75 | +mkdir build && cd build |
| 76 | +
|
| 77 | +# Configure for make utility with compiler tools available in $PATH |
| 78 | +cmake \ |
| 79 | + -DCMAKE_CXX_COMPILER=clang++ \ |
| 80 | + -DTEST_SUITE_SUBDIRS=SYCL \ |
| 81 | + -DSYCL_BE="ext_intel_esimd_emulator" \ |
| 82 | + -DSYCL_TARGET_DEVICES="gpu" \ |
| 83 | + .. |
| 84 | +
|
| 85 | +# Build and Run |
| 86 | +make check |
| 87 | +
|
| 88 | +# Or, for Ninja utility |
| 89 | +cmake -G Ninja \ |
| 90 | + -DCMAKE_CXX_COMPILER=clang++ \ |
| 91 | + -DTEST_SUITE_SUBDIRS=SYCL \ |
| 92 | + -DSYCL_BE="ext_intel_esimd_emulator" \ |
| 93 | + -DSYCL_TARGET_DEVICES="gpu" \ |
| 94 | + .. |
| 95 | +
|
| 96 | +# Build and Run |
| 97 | +ninja check |
| 98 | +
|
| 99 | +``` |
| 100 | + |
| 101 | +Note that only [ESIMD kernels](https://github.com/intel/llvm-test-suite/tree/intel/SYCL/ESIMD) are |
| 102 | +tested with the command examples above, due to the ESIMD_EMULATOR limitations |
| 103 | +described below. |
| 104 | + |
| 105 | +## Limitation |
| 106 | +- The emulator is available only on Linux for now. Windows support is a work in progress. |
| 107 | +- ESIMD_EMULATOR limits the number of threads under Linux. As |
| 108 | +software multi-threading is used for emulating hardware threads, |
| 109 | +the number of threads launched for kernel execution is limited by |
| 110 | +the maximum number of threads supported by the Linux host machine. |
| 111 | + |
| 112 | +- ESIMD_EMULATOR supports only ESIMD kernels. This means that kernels |
| 113 | +written for SYCL cannot run with the ESIMD_EMULATOR backend. It also |
| 114 | +means that kernels containing both SYCL and ESIMD code cannot run with |
| 115 | +ESIMD_EMULATOR, unlike with GPU backends such as OpenCL or Level Zero. |
| 116 | + |
| 117 | +- ESIMD_EMULATOR cannot run in parallel with the host device. |
| 118 | + |
| 119 | +## TODO |
| 120 | + |
| 121 | +- Windows environment support |
| 122 | + |
| 123 | +- Support for arithmetic operations on the 16-bit half-precision floating-point |
| 124 | +number type |
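
The run command in "Running ESIMD code under emulation mode" sets `SYCL_DEVICE_FILTER` for a single invocation. As a minimal sketch of the same idea, the wrapper below runs any command with the emulator selected while leaving the calling shell's environment unchanged. The function name `run_on_esimd_emulator` is hypothetical and is not part of DPC++; only the variable name and value come from the document.

```shell
# Hypothetical helper (an illustration, not part of DPC++): run a command
# with the ESIMD emulator selected. Using `env` scopes the variable to the
# child process, so the calling shell keeps its own device selection.
run_on_esimd_emulator() {
  env SYCL_DEVICE_FILTER=ext_intel_esimd_emulator:gpu "$@"
}

# Usage, assuming ./a.out was built as shown in the document:
#   ./a.out                          # default device selection
#   run_on_esimd_emulator ./a.out    # same binary, run on the emulator
```

Because the variable is scoped to one process, the same binary can be run on a real device and under a debugger on the emulator from the same shell session without recompiling.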