-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL-MLIR] Merge from intel/llvm sycl branch #8774
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Due to incorrect substitution wrong command line was formed on Windows.
According to the SYCL 2020 specification, the reducer class should be neither moveable nor copyable. This commit deletes these constructors and assignment operators. Fixes intel#6065 --------- Signed-off-by: Larsen, Steffen <[email protected]>
This patch adds marray support to all functions from Table 179 of SYCL 2020 spec + to functions fabs, ilogb, fmax, fmin, ldexp, pown, rootn from Table 175 + to function exp10 from Table 177. E2E tests: intel/llvm-test-suite#1656 --------- Co-authored-by: KornevNikita <[email protected]>
Recent changes to the sycl::vec class added a private constructor taking and array. This resulted in constructor ambiguity when passing an initializer list, despite the constructor being private. This commit removes the constructor, making the implementation use the constructor taking both an array and an index sequence directly. --------- Signed-off-by: Larsen, Steffen <[email protected]> Co-authored-by: Alexey Sachkov <[email protected]>
In certain cases get_specialization_constant would cause a segmentation fault, likely due to strict aliasing violations. This commit changes the implementation to use memcpy of the data into the resulting object, as this can be assumed to be valid due to specialization constants being device-copyable. --------- Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
…l#8713) This commit makes a clarification to sycl_ext_oneapi_weak_object that the new interfaces are only available on the host application. --------- Signed-off-by: Larsen, Steffen <[email protected]> Co-authored-by: Greg Lueck <[email protected]>
…ted (intel#8523) When fetching device images compatible with non-input states, we can ignore an image if another one with a higher state is available for all the possible kernel-device pairs. This patch adds the logic for filtering out such unnecessary images so that we can avoid JIT compilation if both AOT and SPIRV images are present.
Signed-off-by: John Pennycook <[email protected]>
This commit splits aspect propagation into two runs: 1. First run propagates all aspects, except fp64. Warnings are still issued for fp64 as if it was fully propagated, but the resulting metadata will not reflect it. This run before optimizations. 2. Second run propagates all aspects, including fp64. This should not have any effect on already propagated aspects. This run will not issue warnings as any conflicts would have been reported by the first pass. See the [design document](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/OptionalDeviceFeatures.md#pre--and-post-optimization-aspect-propagation) for more information. --------- Signed-off-by: Larsen, Steffen <[email protected]>
SYCL 2020 has any and all return bool for scalar and marray arguments, but int for vector arguments. Currently enabling SYCL2020_CONFORMANT_APIS switches all versions of any and all to return bool. This commit changes it so that the variants taking vector arguments are unaffected by SYCL2020_CONFORMANT_APIS and language version. Signed-off-by: Larsen, Steffen <[email protected]>
This commit fixes an issue where the copy and move assignment operators of the weak_object class would be implicitly deleted due to them being missing from the base class. Signed-off-by: Larsen, Steffen <[email protected]>
This commit adds the missing SYCL 2020 is_group trait and changes the definition of is_group_v to be in line with SYCL 2020. Fixes intel#8704. Signed-off-by: Larsen, Steffen <[email protected]>
This was identified by the Unified Runtime team, depending on the target order in `-fsycl-targets` certain binaries are not dumped. ``` -fsycl-targets=nvptx64-nvidia-cuda,spir64 ├── sycl_nvptx641.bin ├── sycl_spir642.spv └── sycl_spir643.spv ``` ``` -fsycl-targets=spir64,nvptx64-nvidia-cuda ├── sycl_spir641.spv ├── sycl_spir642.spv └── sycl_spir643.spv ``` This patch makes sure that if a kernel is compiled for multiple targets, images belonging to different targets will be dumped.
…r::get_pointer. (intel#8493) * accessor::get_pointer and local_accessor::get_pointer return `std::add_pointer_t<value_type>` * Modifies multi_ptr ctors accepting local_accessor. * Improves existing test to check the return type of get_pointer. --------- Co-authored-by: Steffen Larsen <[email protected]>
This patch fixes an outdated line number that is referenced in the documentation.
…enCL and Level Zero (intel#8595) Adds support to query devices for `atomic_memory_scope_capabilities`. The backends supported are OpenCL and Level Zero. For the rest of backends, it has been left unsupported. --------- Signed-off-by: Maronas, Marcos <[email protected]>
…ntel#8698) This works around a VC issue. Signed-off-by: Sarnie, Nick <[email protected]>
Correct a few values in compileTimeProperties according to SPIRV specs. And skip collecting sycl-alignment to SPIRV.ParamDecoration metadata
…intel#8749) Signed-off-by: Maronas, Marcos <[email protected]>
This commit reenables the fp64 aspect check when ensuring compatibility of a given device image, following the relaxation of aspect propagation for the aspect. --------- Signed-off-by: Larsen, Steffen <[email protected]>
…ntel#8517) This patch implements the `atomic_memory_order_capabilities` query in the OpenCL and Level Zero backends/plugins for `device` and `context` Specifically: - OpenCL <2.0 returns the minimum required capability set (`relaxed`) defined in [Section 4.2 of the OpenCL 3.0 specification](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html#CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES). - OpenCL <3.0 and Level Zero backends return all memory order capabilities. - OpenCL >=3.0 queries the actual device to get the supported memory order capabilities. E2E test have also been updated to reflect these changes: intel/llvm-test-suite#1627
Tests for min and max builtins fail on Windows with recent changes. This is due to these being defined as macros, causing their use inside the generators to fail. This commit fixes this by enclosing them in parentheses. Fixes intel#8717. Signed-off-by: Larsen, Steffen <[email protected]>
All tools are switched to opaque pointers already except those we use in compilation for offload targets.
intel#8743) When processing archives when performing offload, the driver will perform a host link which generates a file that is used to determine any offload dependencies. This additional link step is just for gathering the dependency information and shouldn't require the full ability to link. Under some circumstances, for example generating an early or final image archive for FPGA, the symbol resolution is not important. Add a general case of -z undefs during this internal link step and allow the final host link to determine any unresolved symbols at that time.
Completes the extensions for HIP.
Aligns local accessor index calculation with the other accessors.
bfloat16_wrapper.cpp requires SYCL headers, so ensure that they've been staged to the build directory before compiling.
OwnZeMemHandle is used to track whether an allocation should be freed by the runtime or not. There is no point in passing this flag to USMAllocator and then to USMFreeHelper since the allocation cannot come from the USMAllocator (it can only be from zeMemAlloc*). I've thought about moving the memory allocation code to UR first, but there are a lot of dependencies: we would have to move the implementation of context, device, event, and queue at least. It seemed that doing this small change first made more sense.
…#8247) This commit changes the configuration of in-tree SYCL LIT tests to include features for when the L0 and OpenCL plugins are enabled. This allows the interop backend tests to be split into backend-dependent tests to be run only when their backends are enabled. Additionally, this commit changes the fsycl-host-only LIT configuration replacement to ignore unset include paths, avoiding empty "-isystem" options when related backends are disabled. Signed-off-by: Larsen, Steffen <[email protected]>
Compiling kernels for it is not supported, device info sometimes throws, and there is the HIP backend for those wanting to use an AMD GPU. ```console $ sycl-ls # before [opencl:gpu:0] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics 770 [0x4680] 3.0 [22.35.24055] [opencl:gpu:1] AMD Accelerated Parallel Processing, gfx1032 2.0 [3452.0 (HSA1.1,LC)] [ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) UHD Graphics 770 [0x4680] 1.3 [1.3.24055] [ext_oneapi_cuda:gpu:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 3060 0.0 [CUDA 11.7] [ext_oneapi_hip:gpu:0] AMD HIP BACKEND, gfx1032 0.0 [HIP 50221.15] $ sycl-ls # after [opencl:gpu:0] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics 770 [0x4680] 3.0 [22.35.24055] [ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) UHD Graphics 770 [0x4680] 1.3 [1.3.24055] [ext_oneapi_cuda:gpu:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 3060 0.0 [CUDA 11.7] [ext_oneapi_hip:gpu:0] AMD HIP BACKEND, gfx1032 0.0 [HIP 50221.15] ``` Closes intel#5825
I cannot reproduce the error from
The change came from #8517, and it could be built successfully. |
@jandres742, I've seen this issue on my system too. It disappears when I do incremental build, so I think there is some missing dependency in cmake files. Do you know anything about this? |
I cannot reproduce it locally, but it might be other change #8637. could you try w/o that patch? |
It's hard to say for sure if reverting this patch removes the data race, but a few tries went okay. |
…8772) Based on intel/llvm-test-suite#1679 Signed-off-by: Sarnie, Nick <[email protected]>
|
Thanks @whitneywhtsang . |
Similar to e74334e. cgeist is unable to handle it correctly. Signed-off-by: Tsang, Whitney <[email protected]>
Merged commit dbcd0be into https://github.com/intel/llvm/tree/sycl-mlir. |
Please only review dbcd0be, which is similar to e74334e,
detail::dim_loop<>
cannot be handled correctly bycgeist
.Temporally reverted 130466e to unblock the merge, modified #8616 to investigate and add it back.
Reverted a0d0942 due to cmake data race described in #8774 (comment).
Please do not squash and merge this PR.