oneDPL 2022.8.0 release
New Features
- Added host policy support for histogram algorithms.
- Added support for an undersized output range in the range-based merge algorithm.
- Improved performance of the merge and sorting algorithms
  (sort, stable_sort, sort_by_key, stable_sort_by_key) that rely on Merge sort*,
  with device policies for large data sizes.
- Improved performance of copy, fill, for_each, replace, reverse, rotate, transform and 30+
  other algorithms with device policies on GPUs.
- Improved oneDPL use with SYCL implementations other than Intel oneAPI DPC++/C++ Compiler.
Fixed Issues
- Fixed an issue with drop_view in the experimental range-based API.
- Fixed compilation errors in find_if and find_if_not with device policies where the
  user-provided predicate is device copyable but not trivially copyable.
- Fixed incorrect results or synchronous SYCL exceptions for several algorithms when compiled
  with -O0 and executed on a GPU device.
- Fixed an issue preventing inclusion of the <numeric> header after <execution> and
  <algorithm> headers.
- Fixed several issues in the sort, stable_sort, sort_by_key and stable_sort_by_key
  algorithms. These fixes:
  - Allow the use of non-trivially-copyable comparators.
  - Eliminate duplicate kernel names.
  - Resolve incorrect results on devices with sub-group sizes smaller than four.
  - Resolve synchronization errors that were seen on Intel® Arc™** B-series GPU devices.
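
The find_if fix concerns predicates that are not trivially copyable. A minimal sketch of such a type follows, written in standard C++ only so that it builds without a SYCL toolchain. The names GreaterThan and first_above are hypothetical, and std::find_if stands in for a call made with a device policy.

```cpp
#include <algorithm>
#include <type_traits>
#include <vector>

// Hypothetical predicate used for illustration. The user-provided copy
// constructor makes the type non-trivially copyable, even though copying it
// is only a member-wise copy.
struct GreaterThan {
    int threshold;
    explicit GreaterThan(int t) : threshold(t) {}
    GreaterThan(const GreaterThan& other) : threshold(other.threshold) {}
    bool operator()(int x) const { return x > threshold; }
};

static_assert(!std::is_trivially_copyable_v<GreaterThan>,
              "a user-provided copy constructor is never trivial");

// In a SYCL program, a type like this can be declared device copyable by
// specializing sycl::is_device_copyable; that step is omitted here.

// Standard-library stand-in for a find_if call made with a device policy.
// Returns the first element above the threshold, or -1 if there is none.
int first_above(const std::vector<int>& v, int threshold) {
    auto it = std::find_if(v.begin(), v.end(), GreaterThan{threshold});
    return it == v.end() ? -1 : *it;
}
```
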
Known Issues and Limitations
New in This Release
- Incorrect results may be observed when calling sort with a device policy on
  Intel® Arc™ graphics 140V with data sizes of 4-8 million elements.
- sort, stable_sort, sort_by_key and stable_sort_by_key algorithms fail to compile
  when using Clang 17 and earlier versions, as well as compilers based on these versions,
  such as Intel oneAPI DPC++/C++ Compiler 2023.2.0.
- When compiling code that uses device policies with the open source oneAPI DPC++ Compiler
  (clang++ driver), synchronous SYCL runtime exceptions regarding unfound kernels may be
  encountered unless an optimization flag is specified (for example -O1) as opposed to
  relying on the compiler's default optimization level.
Existing Issues
See oneDPL Guide for other restrictions and known limitations.
- histogram algorithm requires the output value type to be an integral type no larger than
  four bytes when used with an FPGA policy.
- histogram may provide incorrect results with device policies in a program built with
  -O0 option.
- Compilation issues may be encountered when passing zip iterators to
  exclusive_scan_by_segment on Windows.
- For transform_exclusive_scan and exclusive_scan to run in-place (that is, with the same data
  used for both input and destination) and with an execution policy of unseq or par_unseq,
  it is required that the provided input and destination iterators are equality comparable.
  Furthermore, the equality comparison of the input and destination iterator must evaluate to true.
  If these conditions are not met, the result of these algorithm calls is undefined.
- Incorrect results may be produced by exclusive_scan, inclusive_scan, transform_exclusive_scan,
  transform_inclusive_scan, exclusive_scan_by_segment, inclusive_scan_by_segment, reduce_by_segment
  with unseq or par_unseq policy when compiled by Intel® oneAPI DPC++/C++ Compiler
  with -fiopenmp, -fiopenmp-simd, -qopenmp, -qopenmp-simd options on Linux.
  To avoid the issue, pass -fopenmp or -fopenmp-simd option instead.
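
The in-place requirement for exclusive_scan and transform_exclusive_scan can be sketched as follows. This is a standard-library stand-in, not oneDPL code: scan_in_place is a hypothetical helper, and std::exclusive_scan is used without a policy so the example builds with any C++17 toolchain. With oneDPL, the same iterators would presumably be passed to oneapi::dpl::exclusive_scan together with an unseq or par_unseq policy.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of oneDPL): runs an exclusive scan in place
// and returns the scanned data.
std::vector<int> scan_in_place(std::vector<int> data) {
    auto first = data.begin();
    auto result = data.begin(); // destination is the same iterator as the input

    // The condition from the note above: input and destination iterators are
    // equality comparable, and the comparison evaluates to true.
    assert(first == result);

    // Each output element is the sum of all input elements before it.
    std::exclusive_scan(first, data.end(), result, 0);
    return data;
}
```

For an input of {1, 2, 3, 4} and an initial value of 0, the helper returns {0, 1, 3, 6}.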
*The sorting algorithms in oneDPL use Radix sort for arithmetic data types and
sycl::half (since oneDPL 2022.6) compared with std::less or std::greater, otherwise Merge sort.
**Intel, the Intel logo, and Arc are the trademarks of Intel Corporation or its subsidiaries.
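
The sort selection rule in the first footnote (*) can be read as a compile-time condition. The sketch below is an illustration only: uses_radix_sort is a hypothetical trait, not a oneDPL API, and sycl::half is left out so the code builds without a SYCL toolchain. The footnote does not say how transparent comparators such as std::less<> are treated, so the sketch only recognizes std::less<T> and std::greater<T>.

```cpp
#include <functional>
#include <type_traits>

// Hypothetical trait (not a oneDPL API): true when, per the footnote, sorting
// elements of type T with comparator Compare would use Radix sort; false
// means Merge sort would be used instead.
template <typename T, typename Compare>
constexpr bool uses_radix_sort =
    std::is_arithmetic_v<T> &&
    (std::is_same_v<Compare, std::less<T>> ||
     std::is_same_v<Compare, std::greater<T>>);
```

Under this reading, int with std::less<int> maps to Radix sort, while any custom comparator or non-arithmetic element type maps to Merge sort.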