[clang-sycl-linker][clang-linker-wrapper] Move sycl linking code from linker wrapper to sycl linker #1

asudarsa · 2025-04-03T04:30:05Z

This is a separate branch that will be maintained to alleviate merge pain later.

Thanks

Confirmed working in David's PR [here](https://github.com/intel/llvm/actions/runs/14227019977/job/39869445580?pr=17335), other failures there unrelated. We hit this now because CMake now finds the system OpenCL in a different path that happens to contain a space. Signed-off-by: Sarnie, Nick <[email protected]>

Theory from [here](intel#17774 (comment)) is we need to match the version from [here](https://github.com/intel/compute-runtime/blob/25.09.32961.7/manifests/manifest.yml#L50-L56). Signed-off-by: Sarnie, Nick <[email protected]>

Reverts intel#17778 Same problem

Bumps [tar-fs](https://github.com/mafintosh/tar-fs) from 2.1.1 to 2.1.2.

Commits:

- [`d97731b`](https://github.com/mafintosh/tar-fs/commit/d97731b0e1b8a244ab859784b514cfcf5585ad3d) 2.1.2
- [`fd1634e`](https://github.com/mafintosh/tar-fs/commit/fd1634e869e7c5f85948e95eabdaa8451a085de5) symlink tweak from main
- See full diff in [compare view](https://github.com/mafintosh/tar-fs/compare/v2.1.1...v2.1.2)

Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Fix Coverity reported auto-copy issues.

…elational.h into libspirv (intel#17793) clc_add_sat.h/clc_sub_sat.h become redundant after recent clc changes. generic/include/relational.h is now only used by libspirv.

We need to skip the check for private memory; otherwise the OpenCL CPU device may generate false-positive reports due to stack reuse across threads. However, the SPIR-V builtin 'ToPrivate' doesn't work as expected on the OpenCL CPU device, so we need to manually clean up the private shadow before each function exit point.

The SYCL RT addImages function may be invoked multiple times for different SYCL binary images, and more than one of these images may depend on the bfloat16 device library. The bfloat16 device library images are provided by the compiler and their implementation is now stable, so we keep only a single copy each of the native and fallback bfloat16 device library in the program manager. These images are not removed until the program manager is destroyed. --------- Signed-off-by: jinge90 <[email protected]>

…#17773) Initializes MID like in all the other constructors.

Also align some strings with regular sycl-nightly

…te (intel#17645) When updating a HostTask using whole graph update, we rely on the SYCL Scheduler to execute the HostTask and all its dependencies. The Scheduler executes this work asynchronously on a separate thread. Because exec_graph.update() was non-blocking, user code could, for example, free resources before the update() command (which depends on the HostTask completing) had executed. This commit fixes the issue by making exec_graph.update() blocking when host_tasks are used. Co-authored-by: Ewan Crawford <[email protected]>

getSignalEvent in ur_command_list_manager returned nullptr when queue was not set. This was the case when initializing the command list manager in queue ctor called from urQueueCreateWithNativeHandle. Fix this by passing queue to the command list manager and removing the default nullptr value from the ctor (to avoid this issue in future). Also, even if queue is nullptr, return the event instead of nullptr from getSignalEvent. This event is mostly functional, it just cannot be used for profiling.
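The shape of this fix can be sketched in plain C++. This is a minimal illustration only: `Queue`, `Event`, `CommandListManager`, and `supportsProfiling` are hypothetical stand-ins, not the actual Unified Runtime types or signatures.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins; the real UR types are more involved.
struct Queue {};

struct Event {
  Queue *queue = nullptr; // may legitimately be null
  // Assumption for this sketch: profiling needs an owning queue.
  bool supportsProfiling() const { return queue != nullptr; }
};

class CommandListManager {
public:
  // No default argument, so every caller has to pass a queue explicitly
  // (possibly nullptr) instead of silently getting none.
  explicit CommandListManager(Queue *queue) : queue_(queue) {}

  // Always hands back an event, even when no queue is attached.
  Event *getSignalEvent() {
    events_.push_back(std::make_unique<Event>());
    events_.back()->queue = queue_;
    return events_.back().get();
  }

private:
  Queue *queue_;
  std::vector<std::unique_ptr<Event>> events_;
};
```

Dropping the default argument turns a forgotten queue into a compile error at the call site, while the queue-less event stays usable for synchronization.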

- Include node type in error message
- Fix unit test triggering topology error instead of unsupported node type error

Example output of new error:

```
terminate called after throwing an instance of 'sycl::_V1::exception'
  what():  node_type::memfill nodes are not supported for update. Only kernel, host_task, barrier and empty nodes are supported.
```

Co-authored-by: Ewan Crawford <[email protected]>
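A rough sketch of how such a message might be assembled, in standalone C++. The enum below lists only a subset of node types, `node_type_name`, `is_updatable`, and `check_updatable` are hypothetical helpers, and `std::invalid_argument` stands in for `sycl::exception`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Subset of node types, for illustration only.
enum class node_type { empty, kernel, host_task, barrier, memfill };

// Hypothetical helper: printable name of a node type.
inline std::string node_type_name(node_type t) {
  switch (t) {
  case node_type::empty:     return "empty";
  case node_type::kernel:    return "kernel";
  case node_type::host_task: return "host_task";
  case node_type::barrier:   return "barrier";
  case node_type::memfill:   return "memfill";
  }
  return "unknown";
}

// Only these node types are accepted by update, per the message above.
inline bool is_updatable(node_type t) {
  return t == node_type::kernel || t == node_type::host_task ||
         t == node_type::barrier || t == node_type::empty;
}

// Throws with the offending node type named in the message.
// std::invalid_argument is a stand-in for sycl::exception here.
inline void check_updatable(node_type t) {
  if (!is_updatable(t))
    throw std::invalid_argument(
        "node_type::" + node_type_name(t) +
        " nodes are not supported for update. Only kernel, host_task, "
        "barrier and empty nodes are supported.");
}
```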

To support the SYCL-Graph extension on an OpenCL backend, we currently only require the presence of the `cl_khr_command_buffer` extension. This PR introduces an extra requirement that the [CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html#CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR) capability be present.

This is based on the [graph execution wording](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc#765-new-handler-member-functions) in the definition of `handler::ext_oneapi_graph()`:

> Only one instance of graph will execute at any time. If graph is submitted multiple times, dependencies are automatically added by the runtime to prevent concurrent executions of an identical graph.

Such usage results in multiple calls by the SYCL runtime to `urEnqueueCommandBufferExp` with the same UR command-buffer, plus event dependencies to prevent concurrent execution. Without support for simultaneous use, the OpenCL adapter code cannot guarantee that the first command-buffer submission has finished executing before it makes the following `clEnqueueCommandBufferKHR` calls with the `cl_event` dependencies. If the first submission is still executing, an error will be reported.

Workarounds like adding blocking host waits to the OpenCL UR adapter are possible, but requiring simultaneous use reflects the vendor requirements as they are for the current implementation.

I've tried to document all of this in the UR spec and SYCL-Graph design docs, which also include a couple of cleanups I found along the way.

Note that the new CTS test fails for the Level-Zero adapter; I've created intel#17734 to resolve that.

--------- Co-authored-by: Mikołaj Komar <[email protected]>

This patch simplifies the stream queue constructor by using in-class initialization when appropriate. And uses the constructors to initialize the stream vectors.

…intel#17821) This removes one dependency on OpenCL library.

…l#17805)

Scheduled igc dev drivers uplift Co-authored-by: GitHub Actions <[email protected]>

This PR no longer generates thread_local pointers for kernels calling other kernels, which happens for example in the work_item loop. Instead of storing the state struct pointer in the thread local, it is passed directly to the called kernel function which was duplicated with an additional state struct pointer parameter if it didn't already have one. The state getter functions (native_cpu state and corresponding mux and nativecpu spirv functions) have been made __attribute((pure)) to enable more optimizations (including removal of unused calls to such builtins) before the NativeCPU passes. Pointer parameters of the native_cpu getter functions now point to constant data.
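The difference between the two calling conventions can be illustrated with a small standalone C++ sketch. It mirrors the idea only and is not the code the NativeCPU passes generate: `State`, `get_global_id`, `inner_kernel`, and `work_item_loop` are hypothetical names, and `[[gnu::pure]]` is the GCC/Clang spelling of the pure attribute.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-work-item state struct.
struct State {
  std::size_t global_id;
};

// Old style: the callee finds its state through a thread-local pointer.
thread_local const State *tls_state = nullptr;
inline std::size_t get_global_id_tls() { return tls_state->global_id; }

// New style: the getter takes a pointer to constant data and is marked pure,
// so repeated or unused calls are candidates for removal by the optimizer.
[[gnu::pure]] inline std::size_t get_global_id(const State *s) {
  return s->global_id;
}

// The called kernel receives the state pointer as an extra parameter.
inline void inner_kernel(int *out, const State *s) {
  out[get_global_id(s)] = 1;
}

// Outer kernel running a work-item loop and forwarding the state directly.
inline void work_item_loop(int *out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    State s{i};
    inner_kernel(out, &s); // no thread-local store/load needed
  }
}
```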

…l#17811)

The V2 adapter now has all required functionality.

To fix compilation issue with the newest compute-runtime

by avoiding unnecessary temp shared_ptr creation

It's no longer needed after SYCLPropagateAspectsUsage pass. --------- Signed-off-by: Sidorov, Dmitry <[email protected]>

by using `const auto &` instead of `auto`.

…el#17757) Our Ubuntu 22.04 container has CUDA 12.1 installed while Ubuntu 24.04 image has CUDA 12.6.1 installed. If a PR changes CUDA adapter, we should test the change with both CUDA versions.

…tel#17834) This test requires a significant amount of host memory. It has been observed that the test may (very rarely) fail with an OUT_OF_HOST_MEMORY error, especially when run in parallel with other "high-overhead" tests. Refer to CMPLRLLVM-66341. This PR makes the test ignore the failure if that happens. An alternative is to check the available host memory and skip the test if it is too low. However, that approach is still susceptible to race conditions.

) A number of changes to UR won't have any observable effects on SYCL builds, so no point in running a full SYCL build and e2e test for them.

Co-authored-by: omarahmed1111 <[email protected]>

image_get_info.cpp previously comprised three types of tests, which have been split into separate test files:

- testing queries of a `oneapi::experimental::image_mem` instance (image_get_info.cpp)
- testing device queries for image requirements (either max image dimensions or pitch alignment) (image_reqs_get_info.cpp)
- testing device queries for bindless images aspects (bindless_aspects.cpp)

Splitting up these tests is useful for diagnosing problems on different backends:

- image_get_info.cpp is currently unsupported only on the HIP backend (known issue)
- image_reqs_get_info.cpp is currently unsupported only on the L0 backend (intel#17663)
- bindless_aspects.cpp has been fixed on the L0 backend, so it now passes on all backends, by only querying `mipmap_max_anisotropy` if `aspect::ext_oneapi_mipmap_anisotropy` returns true

--------- Signed-off-by: JackAKirk <[email protected]>

The driver I have at least has a number of issues, the largest of which is that it lacks an IR compiler. This updates tests so that they are skipped if required.

This patch removes unused handler_impl members.

Address issue intel#16451, where the property `use_root_sync` is not processed properly. Also revise `sycl/test-e2e/GroupAlgorithm/root_group.cpp` so that it no longer uses the deprecated version of `parallel_for` (this was previously blocked by the `use_root_sync` issue).

Some explanation for the change in `handler.hpp`. The previous code does not handle `use_root_sync` correctly because `processLaunchProperties` is called twice: first for the property list returned by the kernel functor's `get(properties_tag)` method, and then for the `Props` passed as a parameter to `parallel_for`. Therefore, if the `get(properties_tag)` method specifies `use_root_sync` and `Props` is empty or doesn't contain `use_root_sync`, the following happens:

- first, the property list returned by the kernel functor's `get(properties_tag)` method gets processed. Since it contains `use_root_sync`, `setKernelIsCooperative(true)` is called;
- then, the property list `Props` passed as a parameter to `parallel_for` gets processed. Since it doesn't contain `use_root_sync` (for the non-deprecated variants of `parallel_for`, `Props` should always be an empty property list), `setKernelIsCooperative(false)` is called.

As a result, the `MKernelIsCooperative` flag ends up false when it should be true. Revising the code as in this PR solves the problem. `MKernelIsCooperative` is false by default, so there is nothing to worry about if `setKernelIsCooperative` is never called.

--------- Signed-off-by: Hu, Peisen <[email protected]>
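The overwrite described above can be reproduced with a simplified standalone model. This is a sketch under stated assumptions: the real `processLaunchProperties` works on compile-time property lists, whereas the hypothetical `LaunchProps` struct here is a plain runtime flag, and `processBuggy`, `processFixed`, `launchBuggy`, and `launchFixed` are illustrative names only.

```cpp
#include <cassert>

// Hypothetical runtime stand-in for a compile-time property list.
struct LaunchProps {
  bool use_root_sync = false;
};

struct Handler {
  bool MKernelIsCooperative = false; // false by default, as in the PR text

  void setKernelIsCooperative(bool v) { MKernelIsCooperative = v; }

  // Old behaviour: every processed list overwrites the flag,
  // so a later list without use_root_sync resets it to false.
  void processBuggy(const LaunchProps &p) {
    setKernelIsCooperative(p.use_root_sync);
  }

  // Revised behaviour: only ever set the flag, never clear it.
  void processFixed(const LaunchProps &p) {
    if (p.use_root_sync)
      setKernelIsCooperative(true);
  }
};

// Both variants process the kernel functor's list first, then Props.
inline bool launchBuggy(const LaunchProps &fromKernel,
                        const LaunchProps &fromCall) {
  Handler h;
  h.processBuggy(fromKernel);
  h.processBuggy(fromCall);
  return h.MKernelIsCooperative;
}

inline bool launchFixed(const LaunchProps &fromKernel,
                        const LaunchProps &fromCall) {
  Handler h;
  h.processFixed(fromKernel);
  h.processFixed(fromCall);
  return h.MKernelIsCooperative;
}
```

With `use_root_sync` set only on the kernel functor's list, `launchBuggy` ends with the flag cleared and `launchFixed` keeps it set.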

This contains a fix for google/googletest#4036 , which we encounter if we are outputting JSON and the test runner finds no adapters.

This PR disables the flaky fill test for command buffers that was enabled in intel#17709. The failure is caused by a driver bug that is fixed in a newer driver version, but the CI machines still have the old one, so the test sometimes fails (for example https://github.com/intel/llvm/actions/runs/14250564960/job/39942652796?pr=17836).

* Removes expression trees support * Applies all previous vec fixes to align with recent spec changes

Fix repo_ref value. I probably didn't notice it when I changed the others.

This tests that converting a UR handle to a native handle, then back into a UR handle, preserves or recreates any handles contained within it. The event test is a special case, since urEventCreateWithNativeHandle doesn't accept a queue parameter. For this, we tolerate a different queue being created, as long as both have the same underlying native handle.

This patch adds a step to conditionally pack pure sycl-toolchain for release.

We've removed host device support some time ago.

This commit updates OCK to apply a code generation fix that will affect upcoming NativeCPU changes, as well as a forward compatibility fix to prevent issues with the next LLVM pulldown or the one thereafter.

Signed-off-by: Arvind Sudarsanam <[email protected]>

sarnex and others added 30 commits April 2, 2025 20:58

[CI] Bump Level Zero version to 1.20.6 (intel#17778)

7d7257c

Theory from [here](intel#17774 (comment)) is we need to match the version from [here](https://github.com/intel/compute-runtime/blob/25.09.32961.7/manifests/manifest.yml#L50-L56). Signed-off-by: Sarnie, Nick <[email protected]>

Revert "[CI] Bump Level Zero version to 1.20.6" (intel#17819)

363faec

Reverts intel#17778 Same problem

[DviceSanitizer][Coverity] Fix auto-copy issues (intel#17794)

f1dfa5e

Fix Coverity reported auto-copy issues.

[NFC][libclc] Replace redundant integer header files with clc, move r…

62ba344

…elational.h into libspirv (intel#17793) clc_add_sat.h/clc_sub_sat.h become redundant after recent clc changes. generic/include/relational.h is now only used by libspirv.

[SYCL][Graph] Fix unitialized member of dynamic_parameter_impl (intel…

027afc7

…#17773) Initializes MID like in all the other constructors.

[CI][sycl-rel] Add BMG to sycl-rel-nightly (intel#17782)

3bd7258

Also align some strings with regular sycl-nightly

[CI][Benchmarks] add driver information to runs (intel#17797)

1426736

[UR][CUDA][HIP] Create stream vectors in queue constructor (intel#17823)

6395951

This patch simplifies the stream queue constructor by using in-class initialization when appropriate. And uses the constructors to initialize the stream vectors.

[NFC][libclc][libspirv] Replace OpenCL as_TYPE with clc __clc_as_TYPE (…

4ec3e5b

…intel#17821) This removes one dependency on OpenCL library.

[UR] Include level zero headers for native command buffers test (inte…

8c8c403

…l#17805)

[GHA] Uplift Linux IGC Dev RT version to igc-dev-b74b7ab (intel#17820)

7129c43

Scheduled igc dev drivers uplift Co-authored-by: GitHub Actions <[email protected]>

[SYCL][E2E][NFC] Refactor Basic/image/image_max_size.cpp test (inte…

6e3a0ef

…l#17811)

[SYCL][E2E] Enable graph tests for L0 v2 adapter (intel#17756)

d2f4596

The V2 adapter now has all required functionality.

[SYCL][UR][Bench] Bump IGC depdency version fro benchmarks (intel#17776)

e9d13a5

To fix compilation issue with the newest compute-runtime

[SYCL] Optimize finalizeHandlerPostProcess (intel#17791)

a485cbf

by avoiding unnecessary temp shared_ptr creation

[NFCI][SYCL] Remove srcloc metadata before sycl-post-link (intel#17727)

3a353b1

It's no longer needed after SYCLPropagateAspectsUsage pass. --------- Signed-off-by: Sidorov, Dmitry <[email protected]>

[SYCL][Graph] Removing unnecessary copy of KernelName (intel#17833)

7369251

by using `const auto &` instead of `auto`.
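The effect of that one-word change can be checked with a copy-counting type. This is an illustrative sketch: `KernelName`, `Node`, and `getKernelName` below are hypothetical stand-ins, not the SYCL-Graph classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

int g_copies = 0; // counts copy-constructor calls

// Hypothetical stand-in for the kernel name object.
struct KernelName {
  std::string value;
  explicit KernelName(std::string v) : value(std::move(v)) {}
  KernelName(const KernelName &other) : value(other.value) { ++g_copies; }
};

struct Node {
  KernelName name{"my_kernel"};
  // Returns a reference; whether a copy happens is up to the caller.
  const KernelName &getKernelName() const { return name; }
};

// `auto` deduces a value type, so the returned reference is copied.
inline std::size_t lengthWithAuto(const Node &n) {
  auto name = n.getKernelName();
  return name.value.size();
}

// `const auto &` binds to the existing object; no copy is made.
inline std::size_t lengthWithConstRef(const Node &n) {
  const auto &name = n.getKernelName();
  return name.value.size();
}
```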

[CI] Run build on Ubuntu 22, pre-commit if CUDA adapter changes. (int…

5b35190

…el#17757) Our Ubuntu 22.04 container has CUDA 12.1 installed while Ubuntu 24.04 image has CUDA 12.6.1 installed. If a PR changes CUDA adapter, we should test the change with both CUDA versions.

[UR] Add codeowners for L0 and CL adapters (intel#17810)

10e20bc

[CI] Don't run SYCL testing for changes that only affect UR (intel#17808

48a08bf

) A number of changes to UR won't have any observable effects on SYCL builds, so no point in running a full SYCL build and e2e test for them.

RossBrunton and others added 21 commits April 4, 2025 11:15

[UR] Add handles to opencl adapter (intel#17572)

42ee42e

Co-authored-by: omarahmed1111 <[email protected]>

[UR] Improve testing for gfx1100 on OpenCL (intel#17747)

92da52d

The driver I have at least has a number of issues, the largest of which is that it lacks an IR compiler. This updates tests so that they are skipped if required.

[AsyncAlloc][SYCL][Exp] Remove unused handler members (intel#17831)

e69e779

This patch removes unused handler_impl members.

[UR] Bump googletest version (intel#17852)

d5153ae

This contains a fix for google/googletest#4036 , which we encounter if we are outputting JSON and the test runner finds no adapters.

[NFCI][SYCL] Refactor handler::unpack (intel#17838)

5d1dedd

[SYCL] Re-implement swizzles from scratch (intel#17817)

63ab1cb

* Removes expression trees support * Applies all previous vec fixes to align with recent spec changes

[CI] Update sycl-rel-nightly.yml (intel#17845)

a123a74

Fix repo_ref value. I probably didn't notice it when I changed the others.

[CI][Benchmarks] Update compute-runtime (intel#17851)

3d70752

[CI] Pack release build (intel#17781)

37916ce

This patch adds a step to conditionally pack pure sycl-toolchain for release.

[NFCI][SYCL] DPCPP_HOST_DEVICE_* can't be set (intel#17816)

bf2b87f

We've removed host device support some time ago.

[SYCL][NativeCPU] Update OCK. (intel#17862)

8ceaf1d

This commit updates OCK to apply a code generation fix that will affect upcoming NativeCPU changes, as well as a forward compatibility fix to prevent issues with the next LLVM pulldown or the one thereafter.

temp

ff9376d

Signed-off-by: Arvind Sudarsanam <[email protected]>

Added all relevant code into clang-sycl-linker

ae9efdc

Signed-off-by: Arvind Sudarsanam <[email protected]>

Removed spirv dump option and replaced std::string with StringRef

a87e4f1

Signed-off-by: Arvind Sudarsanam <[email protected]>

Add logic to write images and bundle

380dc5f

Signed-off-by: Arvind Sudarsanam <[email protected]>

Changes to clang-linker-wrapper to align with community code

81fba25

Signed-off-by: Arvind Sudarsanam <[email protected]>

Initial working flow

94d5235

Signed-off-by: Arvind Sudarsanam <[email protected]>

asudarsa force-pushed the move_sycl_linking_code_from_linker_wrapper_to_sycl_linker branch from e0e1256 to 94d5235 Compare April 4, 2025 23:01

asudarsa marked this pull request as ready for review April 4, 2025 23:03

asudarsa closed this Apr 4, 2025
