Skip to content

[SYCL-MLIR] Merge from intel/llvm sycl branch #8774

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 45 commits into from

Conversation

whitneywhtsang
Copy link
Contributor

@whitneywhtsang whitneywhtsang commented Mar 24, 2023

Please only review dbcd0be, which is similar to e74334e, detail::dim_loop<> cannot be handled correctly by cgeist.

Temporally reverted 130466e to unblock the merge, modified #8616 to investigate and add it back.

Reverted a0d0942 due to cmake data race described in #8774 (comment).

Please do not squash and merge this PR.

Fznamznon and others added 30 commits March 21, 2023 13:49
Due to incorrect substitution wrong command line was formed on Windows.
According to the SYCL 2020 specification, the reducer class should be
neither moveable nor copyable. This commit deletes these constructors
and assignment operators.

Fixes intel#6065

---------

Signed-off-by: Larsen, Steffen <[email protected]>
This patch adds marray support to all functions from Table 179 of SYCL
2020 spec + to functions fabs, ilogb, fmax, fmin, ldexp, pown, rootn
from Table 175 + to function exp10 from Table 177.

E2E tests: intel/llvm-test-suite#1656

---------

Co-authored-by: KornevNikita <[email protected]>
Recent changes to the sycl::vec class added a private constructor taking
and array. This resulted in constructor ambiguity when passing an
initializer list, despite the constructor being private. This commit
removes the constructor, making the implementation use the constructor
taking both an array and an index sequence directly.

---------

Signed-off-by: Larsen, Steffen <[email protected]>
Co-authored-by: Alexey Sachkov <[email protected]>
In certain cases get_specialization_constant would cause a segmentation
fault, likely due to strict aliasing violations. This commit changes the
implementation to use memcpy of the data into the resulting object, as
this can be assumed to be valid due to specialization constants being
device-copyable.

---------

Signed-off-by: Larsen, Steffen <[email protected]>
…l#8713)

This commit makes a clarification to sycl_ext_oneapi_weak_object that
the new interfaces are only available on the host application.

---------

Signed-off-by: Larsen, Steffen <[email protected]>
Co-authored-by: Greg Lueck <[email protected]>
…ted (intel#8523)

When fetching device images compatible with non-input states, we can
ignore an image if another one with a higher state is available for all
the possible kernel-device pairs. This patch adds the logic for
filtering out such unnecessary images so that we can avoid JIT
compilation if both AOT and SPIRV images are present.
This commit splits aspect propagation into two runs:
1. First run propagates all aspects, except fp64. Warnings are still
issued for fp64 as if it was fully propagated, but the resulting
metadata will not reflect it. This run before optimizations.
2. Second run propagates all aspects, including fp64. This should not
have any effect on already propagated aspects. This run will not issue
warnings as any conflicts would have been reported by the first pass.

See the [design
document](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/OptionalDeviceFeatures.md#pre--and-post-optimization-aspect-propagation)
for more information.

---------

Signed-off-by: Larsen, Steffen <[email protected]>
SYCL 2020 has any and all return bool for scalar and marray arguments,
but int for vector arguments. Currently enabling
SYCL2020_CONFORMANT_APIS switches all versions of any and all to return
bool. This commit changes it so that the variants taking vector
arguments are unaffected by SYCL2020_CONFORMANT_APIS and language
version.

Signed-off-by: Larsen, Steffen <[email protected]>
This commit fixes an issue where the copy and move assignment operators
of the weak_object class would be implicitly deleted due to them being
missing from the base class.

Signed-off-by: Larsen, Steffen <[email protected]>
This commit adds the missing SYCL 2020 is_group trait and changes the
definition of is_group_v to be in line with SYCL 2020.

Fixes intel#8704.

Signed-off-by: Larsen, Steffen <[email protected]>
This was identified by the Unified Runtime team, depending on the target
order in `-fsycl-targets` certain binaries are not dumped.
```
-fsycl-targets=nvptx64-nvidia-cuda,spir64

├── sycl_nvptx641.bin
├── sycl_spir642.spv
└── sycl_spir643.spv
```
```
-fsycl-targets=spir64,nvptx64-nvidia-cuda

├── sycl_spir641.spv
├── sycl_spir642.spv
└── sycl_spir643.spv
```
This patch makes sure that if a kernel is compiled for multiple targets,
images belonging to different targets will be dumped.
…r::get_pointer. (intel#8493)

* accessor::get_pointer and local_accessor::get_pointer return
`std::add_pointer_t<value_type>`
* Modifies multi_ptr ctors accepting local_accessor.
* Improves existing test to check the return type of get_pointer.

---------

Co-authored-by: Steffen Larsen <[email protected]>
This patch fixes an outdated line number that is referenced in the
documentation.
…enCL and Level Zero (intel#8595)

Adds support to query devices for `atomic_memory_scope_capabilities`.
The backends supported are OpenCL and Level Zero. For the rest of
backends, it has been left unsupported.

---------

Signed-off-by: Maronas, Marcos <[email protected]>
Correct a few values in compileTimeProperties according to SPIRV specs.
And skip collecting sycl-alignment to SPIRV.ParamDecoration metadata
This commit reenables the fp64 aspect check when ensuring compatibility
of a given device image, following the relaxation of aspect propagation
for the aspect.

---------

Signed-off-by: Larsen, Steffen <[email protected]>
…ntel#8517)

This patch implements the `atomic_memory_order_capabilities` query in
the OpenCL and Level Zero backends/plugins for `device` and `context`

Specifically: 
- OpenCL <2.0 returns the minimum required capability set (`relaxed`)
defined in [Section 4.2 of the OpenCL 3.0
specification](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html#CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES).
- OpenCL <3.0 and Level Zero backends return all memory order
capabilities.
- OpenCL >=3.0 queries the actual device to get the supported memory
order capabilities.

E2E test have also been updated to reflect these changes:
intel/llvm-test-suite#1627
Tests for min and max builtins fail on Windows with recent changes. This
is due to these being defined as macros, causing their use inside the
generators to fail. This commit fixes this by enclosing them in
parentheses.

Fixes intel#8717.

Signed-off-by: Larsen, Steffen <[email protected]>
All tools are switched to opaque pointers already except those we use in
compilation for offload targets.
intel#8743)

When processing archives when performing offload, the driver will
perform a host link which generates a file that is used to determine any
offload dependencies. This additional link step is just for gathering
the dependency information and shouldn't require the full ability to
link.

Under some circumstances, for example generating an early or final image
archive for FPGA, the symbol resolution is not important. Add a general
case of -z undefs during this internal link step and allow the final
host link to determine any unresolved symbols at that time.
Aligns local accessor index calculation with the other accessors.
bfloat16_wrapper.cpp requires SYCL headers, so ensure that they've been
staged to the build directory before compiling.
igchor and others added 2 commits March 24, 2023 16:02
OwnZeMemHandle is used to track whether an allocation should be freed by
the runtime or not.

There is no point in passing this flag to USMAllocator and then to
USMFreeHelper since the allocation cannot come from the USMAllocator (it
can only be from zeMemAlloc*).

I've thought about moving the memory allocation code to UR first, but
there are a lot of dependencies: we would have to move the
implementation of context, device, event, and queue at least. It seemed
that doing this small change first made more sense.
…#8247)

This commit changes the configuration of in-tree SYCL LIT tests to
include features for when the L0 and OpenCL plugins are enabled. This
allows the interop backend tests to be split into backend-dependent
tests to be run only when their backends are enabled. Additionally, this
commit changes the fsycl-host-only LIT configuration replacement to
ignore unset include paths, avoiding empty "-isystem" options when
related backends are disabled.

Signed-off-by: Larsen, Steffen <[email protected]>
Compiling kernels for it is not supported, device info sometimes throws,
and there is the HIP backend for those wanting to use an AMD GPU.

```console
$ sycl-ls  # before
[opencl:gpu:0] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics 770 [0x4680] 3.0 [22.35.24055]
[opencl:gpu:1] AMD Accelerated Parallel Processing, gfx1032 2.0 [3452.0 (HSA1.1,LC)]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) UHD Graphics 770 [0x4680] 1.3 [1.3.24055]
[ext_oneapi_cuda:gpu:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 3060 0.0 [CUDA 11.7]
[ext_oneapi_hip:gpu:0] AMD HIP BACKEND, gfx1032 0.0 [HIP 50221.15]

$ sycl-ls  # after
[opencl:gpu:0] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics 770 [0x4680] 3.0 [22.35.24055]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) UHD Graphics 770 [0x4680] 1.3 [1.3.24055]
[ext_oneapi_cuda:gpu:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 3060 0.0 [CUDA 11.7]
[ext_oneapi_hip:gpu:0] AMD HIP BACKEND, gfx1032 0.0 [HIP 50221.15]
```

Closes intel#5825
@whitneywhtsang
Copy link
Contributor Author

I cannot reproduce the error from Linux / Build + LIT locally:

/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/level_zero/ur_level_zero_common.hpp:10:10: fatal error: sycl/detail/pi.h: No such file or directory
   10 | #include <sycl/detail/pi.h>
      |          ^~~~~~~~~~~~~~~~~~

The change came from #8517, and it could be built successfully.
@AlexeySachkov Any idea why we are getting the error here?

@bader
Copy link
Contributor

bader commented Mar 24, 2023

I cannot reproduce the error from Linux / Build + LIT locally:

/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/level_zero/ur_level_zero_common.hpp:10:10: fatal error: sycl/detail/pi.h: No such file or directory
   10 | #include <sycl/detail/pi.h>
      |          ^~~~~~~~~~~~~~~~~~

The change came from #8517, and it could be built successfully. @AlexeySachkov Any idea why we are getting the error here?

@jandres742, I've seen this issue on my system too. It disappears when I do incremental build, so I think there is some missing dependency in cmake files. Do you know anything about this?

@jandres742
Copy link
Contributor

I cannot reproduce the error from Linux / Build + LIT locally:

/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/level_zero/ur_level_zero_common.hpp:10:10: fatal error: sycl/detail/pi.h: No such file or directory
   10 | #include <sycl/detail/pi.h>
      |          ^~~~~~~~~~~~~~~~~~

The change came from #8517, and it could be built successfully. @AlexeySachkov Any idea why we are getting the error here?

@jandres742, I've seen this issue on my system too. It disappears when I do incremental build, so I think there is some missing dependency in cmake files. Do you know anything about this?

@bader , @whitneywhtsang :

I cannot reproduce it locally, but it might be other change #8637. could you try w/o that patch?

@bader
Copy link
Contributor

bader commented Mar 25, 2023

I cannot reproduce the error from Linux / Build + LIT locally:

/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/level_zero/ur_level_zero_common.hpp:10:10: fatal error: sycl/detail/pi.h: No such file or directory
   10 | #include <sycl/detail/pi.h>
      |          ^~~~~~~~~~~~~~~~~~

The change came from #8517, and it could be built successfully. @AlexeySachkov Any idea why we are getting the error here?

@jandres742, I've seen this issue on my system too. It disappears when I do incremental build, so I think there is some missing dependency in cmake files. Do you know anything about this?

@bader , @whitneywhtsang :

I cannot reproduce it locally, but it might be other change #8637. could you try w/o that patch?

It's hard to say for sure if reverting this patch removes the data race, but a few tries went okay.

@whitneywhtsang
Copy link
Contributor Author

I cannot reproduce the error from Linux / Build + LIT locally:

/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/level_zero/ur_level_zero_common.hpp:10:10: fatal error: sycl/detail/pi.h: No such file or directory
   10 | #include <sycl/detail/pi.h>
      |          ^~~~~~~~~~~~~~~~~~

The change came from #8517, and it could be built successfully. @AlexeySachkov Any idea why we are getting the error here?

@jandres742, I've seen this issue on my system too. It disappears when I do incremental build, so I think there is some missing dependency in cmake files. Do you know anything about this?

@bader , @whitneywhtsang :
I cannot reproduce it locally, but it might be other change #8637. could you try w/o that patch?

It's hard to say for sure if reverting this patch removes the data race, but a few tries went okay.

Linux / Build + LIT passes with #8637 reverted.

@jandres742
Copy link
Contributor

I cannot reproduce the error from Linux / Build + LIT locally:

/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/level_zero/ur_level_zero_common.hpp:10:10: fatal error: sycl/detail/pi.h: No such file or directory
   10 | #include <sycl/detail/pi.h>
      |          ^~~~~~~~~~~~~~~~~~

The change came from #8517, and it could be built successfully. @AlexeySachkov Any idea why we are getting the error here?

@jandres742, I've seen this issue on my system too. It disappears when I do incremental build, so I think there is some missing dependency in cmake files. Do you know anything about this?

@bader , @whitneywhtsang :
I cannot reproduce it locally, but it might be other change #8637. could you try w/o that patch?

It's hard to say for sure if reverting this patch removes the data race, but a few tries went okay.

Linux / Build + LIT passes with #8637 reverted.

I cannot reproduce the error from Linux / Build + LIT locally:

/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/level_zero/ur_level_zero_common.hpp:10:10: fatal error: sycl/detail/pi.h: No such file or directory
   10 | #include <sycl/detail/pi.h>
      |          ^~~~~~~~~~~~~~~~~~

The change came from #8517, and it could be built successfully. @AlexeySachkov Any idea why we are getting the error here?

@jandres742, I've seen this issue on my system too. It disappears when I do incremental build, so I think there is some missing dependency in cmake files. Do you know anything about this?

@bader , @whitneywhtsang :
I cannot reproduce it locally, but it might be other change #8637. could you try w/o that patch?

It's hard to say for sure if reverting this patch removes the data race, but a few tries went okay.

Linux / Build + LIT passes with #8637 reverted.

Thanks @whitneywhtsang .

@bader : do we need then to revert #8637 on sycl branch?

Similar to e74334e.
cgeist is unable to handle it correctly.

Signed-off-by: Tsang, Whitney <[email protected]>
@whitneywhtsang whitneywhtsang marked this pull request as ready for review March 25, 2023 21:15
@whitneywhtsang
Copy link
Contributor Author

Merged commit dbcd0be into https://github.com/intel/llvm/tree/sycl-mlir.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disable-lint Skip linter check step and proceed with build jobs sycl-mlir Pull requests or issues for sycl-mlir branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.