Commit b8e6d23

dongkyunahn-intel, kbobrovs, and Erich Keane authored
[SYCL][ESIMD][DOC] Description on ESIMD_EMULATOR backend (#5923)
Co-authored-by: kbobrovs
Co-authored-by: Erich Keane
1 parent 1192a4c commit b8e6d23

3 files changed: +133 -3 lines changed


sycl/doc/extensions/experimental/sycl_ext_intel_esimd/ESIMD-TODO-list.md

Lines changed: 0 additions & 1 deletion
@@ -17,4 +17,3 @@ github issues mechanism.
1717
For now the check is only if FD is '()' operator. Works OK for today's
1818
handler::kernel_parallel_for/... implementations as no other '()' operators
1919
are invoked except the kernel body.
20-

sycl/doc/extensions/experimental/sycl_ext_intel_esimd/README.md

Lines changed: 9 additions & 2 deletions
@@ -47,7 +47,7 @@ objects and `copy_to` intrinsics are used which are available only in the ESIMD e
4747
Full runnable code sample can be found on the
4848
[github repo](https://github.com/intel/llvm-test-suite/blob/intel/SYCL/ESIMD/vadd_usm.cpp).
4949
50-
#### Compiling and running ESIMD code.
50+
#### Compiling and running ESIMD code
5151
5252
Code that uses the ESIMD extension can be compiled and run using the same commands
5353
as standard SYCL:
@@ -70,6 +70,12 @@ Regular SYCL and ESIMD kernels can co-exist in the same translation unit and in
7070
the same application, however interoperability (e.g. invocation of ESIMD
7171
functions from a standard SYCL code) between them is not yet supported.
7272
73+
#### ESIMD_EMULATOR backend
74+
75+
On Linux, the same resulting executable file can be run
76+
on the CPU in emulation mode without an Intel GPU. For details, see
77+
[ESIMD_EMULATOR backend](esimd_emulator.md).
78+
7379
#### Restrictions
7480
7581
This section contains lists of the main restrictions that apply when using the ESIMD
@@ -95,6 +101,7 @@ done via explicit APIs; e.g. `sycl::ext::intel::experimental::esimd::block_store
95101
- `sycl::sampler` and `sycl::stream` classes
96102
97103
##### Other restrictions:
98-
- Only Intel GPU device is supported
104+
105+
- Only Intel GPU device is supported.
99106
- Interoperability between regular SYCL and ESIMD kernels is not yet supported.
100107
I.e., it's not possible to invoke an ESIMD kernel from SYCL kernel and vice-versa.
Lines changed: 124 additions & 0 deletions
@@ -0,0 +1,124 @@
1+
# ESIMD kernel execution emulation on host
2+
3+
## Introduction
4+
5+
The ESIMD implementation can execute ESIMD kernels on the host
6+
CPU without an actual Intel GPU device in the system; this is the ESIMD emulator.
7+
Its main purpose is to provide users with a way to conveniently debug ESIMD code
8+
in their favorite debuggers. Performance is not a priority for now and will likely be quite
9+
low. Since the emulator tries to model massively parallel GPU kernel execution on CPU
10+
hardware, some differences in execution profile may occur, and this must be taken
11+
into account when debugging. Redirecting execution to the ESIMD emulator is as simple as
12+
setting an environment variable; no program recompilation is needed. When running a
13+
kernel via the emulator, the SYCL runtime sees the emulator as a normal GPU device, i.e.
14+
the `is_gpu()` test returns true for it.
15+
16+
Due to the specifics of the ESIMD programming model, the usual SYCL host device can't execute
17+
ESIMD kernels. For example, executing them requires supporting libraries to emulate various kinds
18+
of barriers and GPU execution threads. It would be impractical for the host part of a SYCL ESIMD
19+
app to include or link to all the necessary infrastructure components, as they are not needed
20+
in most cases, when there is no ESIMD code or no debugging is wanted. It would also be
21+
inconvenient or even impossible for users to recompile the app with some switch to
22+
execute the ESIMD part on the CPU. The environment variable plus a separate back-end solve
23+
both problems.
24+
25+
The ESIMD emulator comprises the following main components:
26+
1) The ESIMD emulator plugin which is a SYCL runtime back-end similar to OpenCL or
27+
Level Zero.
28+
2) Host implementations of low-level ESIMD intrinsics such as `__esimd_scatter_scaled`.
29+
3) The supporting infrastructure linked dynamically to the plugin - the `libCM` library.
30+
31+
See the limitations section below for the main ESIMD emulator limitations.
32+
33+
## Requirements
34+
35+
The ESIMD_EMULATOR backend uses the [CM_EMU
36+
library](https://github.com/intel/cm-cpu-emulation) to emulate the GPU
37+
using software multi-threading. The library can be either provided as
38+
a separate pre-installed library on the host machine or built as part of
39+
the open-source Intel DPC++ compiler. The required version of CM_EMU is
40+
[1.0.20](https://github.com/intel/cm-cpu-emulation/releases/tag/v2022-02-11)
41+
or later. To have the CM_EMU library as part of the Intel DPC++
42+
compiler for the ESIMD_EMULATOR backend, the library needs to be built
43+
during the build of the ESIMD_EMULATOR plug-in module. Details on
44+
building the CM_EMU library for ESIMD_EMULATOR, such as required packages,
45+
are described in [ESIMD CPU Emulation](https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md#build-dpc-toolchain-with-support-for-esimd-cpu-emulation).
46+
47+
## Command line and environment variable options
48+
49+
No special command line option is required for building ESIMD kernels
50+
for the ESIMD_EMULATOR backend. To run them under emulation, set the
51+
`SYCL_DEVICE_FILTER` environment variable as described in the next section.
52+
53+
## Running ESIMD code under emulation mode
54+
55+
The compilation step for ESIMD kernels prepared for the ESIMD_EMULATOR backend
56+
is the same as for the OpenCL and Level Zero backends. The full runnable code
57+
sample used below can be found on the [github
58+
repo](https://github.com/intel/llvm-test-suite/blob/intel/SYCL/ESIMD/vadd_usm.cpp).
59+
60+
To compile using the open-source Intel DPC++ compiler:
61+
> `$ clang++ -fsycl vadd_usm.cpp`
62+
63+
To compile using the Intel(R) oneAPI Toolkit:
64+
> `$ dpcpp vadd_usm.cpp`
65+
66+
To run under emulation through the ESIMD_EMULATOR backend:
67+
> `$ SYCL_DEVICE_FILTER=ext_intel_esimd_emulator:gpu ./a.out`
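A small shell wrapper can make the emulator run explicit and repeatable. The sketch below is illustrative only: the function name `run_on_esimd_emulator` is hypothetical and not part of DPC++, and it assumes the `SYCL_DEVICE_FILTER` value shown above is the only setting needed to route execution to the emulator.

```shell
# Hypothetical helper (not part of DPC++): runs any command with the
# device filter from this document set for that command only.
run_on_esimd_emulator() {
    if [ "$#" -eq 0 ]; then
        echo "usage: run_on_esimd_emulator <program> [args...]" >&2
        return 2
    fi
    # `env` sets the variable only for the child process, so the calling
    # shell keeps whatever backend selection it had before.
    env SYCL_DEVICE_FILTER=ext_intel_esimd_emulator:gpu "$@"
}
```

With this in place, `run_on_esimd_emulator ./a.out` should behave like the one-line command above, while a plain `./a.out` keeps using the default device selection.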
68+
69+
## Running ESIMD examples from the [ESIMD test suite](https://github.com/intel/llvm-test-suite/tree/intel/SYCL/ESIMD) on GitHub with the ESIMD_EMULATOR backend
70+
71+
```
72+
# Get sources
73+
git clone https://github.com/intel/llvm-test-suite
74+
cd llvm-test-suite
75+
mkdir build && cd build
76+
77+
# Configure for make utility with compiler tools available in $PATH
78+
cmake \
79+
-DCMAKE_CXX_COMPILER=clang++ \
80+
-DTEST_SUITE_SUBDIRS=SYCL \
81+
-DSYCL_BE="ext_intel_esimd_emulator" \
82+
-DSYCL_TARGET_DEVICES="gpu" \
83+
..
84+
85+
# Build and Run
86+
make check
87+
88+
# Or, for Ninja utility
89+
cmake -G Ninja \
90+
-DCMAKE_CXX_COMPILER=clang++ \
91+
-DTEST_SUITE_SUBDIRS=SYCL \
92+
-DSYCL_BE="ext_intel_esimd_emulator" \
93+
-DSYCL_TARGET_DEVICES="gpu" \
94+
..
95+
96+
# Build and Run
97+
ninja check
98+
99+
```
100+
101+
Note that only [ESIMD kernels](https://github.com/intel/llvm-test-suite/tree/intel/SYCL/ESIMD) are
102+
tested with the above command examples, due to the ESIMD_EMULATOR limitations
103+
described below.
104+
105+
## Limitations
106+
- The emulator is available only on Linux for now. Windows support is a work in progress.
107+
- ESIMD_EMULATOR has a limit on the number of threads under Linux. As
108+
software multi-threading is used for emulating hardware threads, the
109+
number of threads launched for kernel execution is limited by
110+
the maximum number of threads supported by the Linux host machine.
111+
112+
- ESIMD_EMULATOR supports only ESIMD kernels. This means regular SYCL
113+
kernels cannot run with the ESIMD_EMULATOR backend. This also
114+
means that kernels containing both SYCL and ESIMD code cannot run with
115+
ESIMD_EMULATOR, unlike GPU backends such as OpenCL or Level Zero.
116+
117+
- ESIMD_EMULATOR cannot run in parallel with the host device.
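
Because emulated hardware threads map to host threads, it can help to check the host's limits before launching a large kernel under emulation. The sketch below prints two common Linux limits. Which of them the emulator reaches first is not stated in this document, so the choice of `/proc/sys/kernel/threads-max` and the `Max processes` entry of `/proc/self/limits` is an assumption, and the function name `show_host_thread_limits` is hypothetical.

```shell
# Hypothetical helper: prints host limits that may bound how many
# emulated hardware threads can be created. Linux-specific paths.
show_host_thread_limits() {
    # System-wide cap on the total number of threads.
    if [ -r /proc/sys/kernel/threads-max ]; then
        echo "threads-max: $(cat /proc/sys/kernel/threads-max)"
    else
        echo "threads-max: unavailable"
    fi
    # Per-user cap on processes and threads, as inherited by this shell.
    if [ -r /proc/self/limits ]; then
        grep 'Max processes' /proc/self/limits
    else
        echo "Max processes: unavailable"
    fi
}

show_host_thread_limits
```

If a kernel launch needs more threads than these values allow, reducing the launch size is a reasonable first thing to try while debugging.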
118+
119+
## TODO
120+
121+
- Windows environment support
122+
123+
- Support for arithmetic operations on the 16-bit half-precision
124+
floating-point type
