v2.0-beta06
Pre-release
This is a preview release of oneDNN v2.0. The release is based on oneDNN v1.4.
Binary distribution of this software is available as Intel(R) oneAPI Deep Neural Network Library in Intel(R) oneAPI.
New Functionality
- The Level Zero (L0) GPU runtime is used by default on Linux. The OpenCL GPU runtime can still be used if the SYCL_BE environment variable is set to PI_OPENCL before running a DPC++ program.
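For example, assuming ./my_dpcpp_app is a placeholder for your DPC++ application binary:
$ export SYCL_BE=PI_OPENCL
$ ./my_dpcpp_app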
Known Limitations
- Level Zero GPU runtime is not supported on Windows OS.
- RNN primitives are not functional with the Level Zero GPU runtime. The workaround is to use the OpenCL GPU runtime by setting SYCL_BE=PI_OPENCL before running a DPC++ program.
- The Level Zero runtime is enabled by default. Make sure the Level Zero driver, including the level-zero-devel package, is installed properly by following the installation guide. If runtime issues persist, apply the workaround of setting SYCL_BE=PI_OPENCL before running a DPC++ program.
- Optimized primitives can crash or fail for huge spatial sizes on CPU.
- dnnl_sgemm, dnnl_gemm_u8s8u32, and inner product functionality do not support sizes exceeding 2^32.
- f32 convolutions may fail sporadically on Intel® Processor Graphics Gen11 due to a known issue in Intel Graphics Compiler.
- Non-Intel GPUs are not supported. The library API allows creating a DNNL engine by index (the order of devices is determined by the SYCL runtime), and there is no check that the selected GPU device is an Intel device. For more control, users can create a DNNL engine by passing a SYCL device and context explicitly.
- GPU kernels that take longer than a certain time to execute (the threshold depends on the OS and system settings) may cause an apparent hang of the application. To avoid hangs of DPC++ or OpenCL programs, including DNNL examples, configure the driver to disable this timeout.
On Linux:
$ sudo bash -c 'echo N > /sys/module/i915/parameters/enable_hangcheck'
On Windows, increase the TdrDelay and TdrDdiDelay values in the registry.
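For example, a sketch only (the 60-second values are illustrative; run from an elevated Command Prompt and reboot for the change to take effect):
reg add "HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers" /v TdrDelay /t REG_DWORD /d 60 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers" /v TdrDdiDelay /t REG_DWORD /d 60 /f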