@asudarsa asudarsa commented Apr 3, 2025

This is a separate branch that will be maintained to alleviate merge pain later.

Thanks

sarnex and others added 30 commits April 2, 2025 20:58
Confirmed working in David's PR
[here](https://github.com/intel/llvm/actions/runs/14227019977/job/39869445580?pr=17335),
other failures there unrelated.

We hit this now because CMake now finds the system OpenCL in a different
path that happens to contain a space.

Signed-off-by: Sarnie, Nick <[email protected]>
Bumps [tar-fs](https://github.com/mafintosh/tar-fs) from 2.1.1 to 2.1.2.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/mafintosh/tar-fs/commit/d97731b0e1b8a244ab859784b514cfcf5585ad3d"><code>d97731b</code></a>
2.1.2</li>
<li><a
href="https://github.com/mafintosh/tar-fs/commit/fd1634e869e7c5f85948e95eabdaa8451a085de5"><code>fd1634e</code></a>
symlink tweak from main</li>
<li>See full diff in <a
href="https://github.com/mafintosh/tar-fs/compare/v2.1.1...v2.1.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=tar-fs&package-manager=npm_and_yarn&previous-version=2.1.1&new-version=2.1.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Fix Coverity reported auto-copy issues.
…elational.h into libspirv (intel#17793)

clc_add_sat.h/clc_sub_sat.h become redundant after recent clc changes.
generic/include/relational.h is now only used by libspirv.
We need to skip the check for private memory; otherwise the OpenCL CPU
device may generate false-positive reports due to stack reuse across
different threads. However, the SPIR-V builtin 'ToPrivate' doesn't work as
expected on the OpenCL CPU device, so we need to manually clean up the
private shadow before each function exit point.
The SYCL RT addImages function may be invoked multiple times for different
SYCL binary images, and more than one of these images may depend
on the bfloat16 device library. The bfloat16 device library images are
provided by the compiler and their implementation is now stable, so we
keep only a single copy of the native and fallback versions of the
bfloat16 device library in the program manager. These images are not
removed until the program manager is destroyed.

---------

Signed-off-by: jinge90 <[email protected]>
…#17773)

Initializes MID like in all the other constructors.
Also align some strings with regular sycl-nightly
…te (intel#17645)

When updating a HostTask using whole graph update, we rely
on the Sycl Scheduler to execute the HostTask and all its
dependencies. The Scheduler executes this work asynchronously
using a separate thread. Since exec_graph.update() was non-blocking,
this caused issues where the user code could, for example,
free resources before the update() command (which depends on
HostTask completion) could execute.

This commit fixes this issue by making exec_graph.update() blocking
when host_tasks are used.

Co-authored-by: Ewan Crawford <[email protected]>
getSignalEvent in ur_command_list_manager returned nullptr when queue
was not set. This was the case when initializing the command list
manager in queue ctor called from urQueueCreateWithNativeHandle.

Fix this by passing queue to the command list manager and removing the
default nullptr value from the ctor (to avoid this issue in future).

Also, even if queue is nullptr, return the event instead of nullptr from
getSignalEvent. This event is mostly functional, it just cannot be used
for profiling.
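
The behaviour described above can be sketched with a small standalone model. This is only an illustration under stated assumptions: `QueueModel`, `EventModel`, `CommandListManagerModel` and `supportsProfiling` are hypothetical names invented here, not the real types in `ur_command_list_manager`.

```cpp
#include <memory>

// Hypothetical stand-ins; the real adapter types are not shown in this PR text.
struct QueueModel {};

struct EventModel {
  const QueueModel *Queue; // may be null when no queue is available

  // Assumption taken from the description: an event without a queue is
  // usable, it just cannot be used for profiling.
  bool supportsProfiling() const { return Queue != nullptr; }
};

class CommandListManagerModel {
public:
  // No default argument: every caller must pass the queue explicitly,
  // so it cannot be forgotten by accident.
  explicit CommandListManagerModel(const QueueModel *Q) : Queue(Q) {}

  // Always returns an event, even when the queue is null.
  std::unique_ptr<EventModel> getSignalEvent() const {
    return std::make_unique<EventModel>(EventModel{Queue});
  }

private:
  const QueueModel *Queue;
};
```

Dropping the default argument turns a forgotten queue into a compile error at the call site, while the non-null return keeps callers that really have no queue working.
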
- Include node type in error message
- Fix unit test triggering topology error instead of unsupported node
type error

Example output of new error:

```
terminate called after throwing an instance of 'sycl::_V1::exception'
  what():  node_type::memfill nodes are not supported for update. Only kernel, host_task, barrier and empty nodes are supported.
```

Co-authored-by: Ewan Crawford <[email protected]>
To support the SYCL-Graph extension on an OpenCL backend, we currently
only require the presence of the `cl_khr_command_buffer` extension. This
PR introduces an extra requirement on the
[CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html#CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR)
capability being present.

This is based on the [graph execution
wording](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc#765-new-handler-member-functions)
on the definition of `handler::ext_oneapi_graph()` that:

> Only one instance of graph will execute at any time. If graph is
submitted multiple times, dependencies are automatically added by the
runtime to prevent concurrent executions of an identical graph.

Such usage results in multiple calls by the SYCL runtime to
`urEnqueueCommandBufferExp` with the same UR command-buffer and event
dependencies to prevent concurrent execution. Without support for
simultaneous use, the OpenCL adapter code cannot guarantee that the first
command-buffer submission has finished execution before it makes
subsequent `clEnqueueCommandBufferKHR` calls with the `cl_event`
dependencies. If the first submission is still executing, then an error
will be reported.

Workarounds like adding blocking host waits to the OpenCL UR adapter are
possible, but requiring simultaneous use reflects the vendor
requirements as they stand for the current implementation. I've tried to
document all of this in the UR spec and SYCL-Graph design docs, which also
include a couple of cleanups I found along the way.

Note that the new CTS test fails for the Level-Zero adapter; I've
created intel#17734 to resolve this.

---------

Co-authored-by: Mikołaj Komar <[email protected]>
This patch simplifies the stream queue constructor by using in-class
initialization when appropriate and by using the constructors to
initialize the stream vectors.
Scheduled igc dev drivers uplift

Co-authored-by: GitHub Actions <[email protected]>
This PR no longer generates thread_local pointers for kernels calling
other kernels, which happens for example in the work_item loop. Instead
of storing the state struct pointer in the thread local, it is passed
directly to the called kernel function which was duplicated with an
additional state struct pointer parameter if it didn't already have one.
The state getter functions (native_cpu state and corresponding mux and
nativecpu spirv functions) have been made __attribute((pure)) to enable
more optimizations (including removal of unused calls to such builtins)
before the NativeCPU passes.
Pointer parameters of the native_cpu getter functions now point to
constant data.
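
As a rough sketch of the difference, the two calling schemes can be modelled in plain C++. Everything here is hypothetical (`StateModel`, the `*TLS` and `*WithState` functions, the `MODEL_PURE` macro); it only mirrors the idea of replacing a thread-local state pointer with an explicit parameter and is not the NativeCPU pass output.

```cpp
#include <cassert>
#include <cstddef>

// GCC/Clang spelling of the pure attribute; a no-op on other compilers.
#if defined(__GNUC__) || defined(__clang__)
#define MODEL_PURE __attribute__((pure))
#else
#define MODEL_PURE
#endif

// Hypothetical stand-in for the native_cpu state struct.
struct StateModel {
  std::size_t GlobalId;
};

// Old scheme: the caller publishes the state through a thread_local
// pointer and the callee's getter reads it back implicitly.
thread_local const StateModel *CurrentState = nullptr;

inline std::size_t getGlobalIdTLS() {
  assert(CurrentState && "caller must publish the state first");
  return CurrentState->GlobalId;
}
inline std::size_t calleeKernelTLS() { return getGlobalIdTLS() * 2; }
inline std::size_t callerKernelTLS(const StateModel *State) {
  CurrentState = State;
  return calleeKernelTLS();
}

// New scheme: the callee is duplicated with an extra state parameter, and
// the getter is pure and only reads through a pointer to constant data.
MODEL_PURE inline std::size_t getGlobalId(const StateModel *State) {
  return State->GlobalId;
}
inline std::size_t calleeKernelWithState(const StateModel *State) {
  return getGlobalId(State) * 2;
}
inline std::size_t callerKernelWithState(const StateModel *State) {
  return calleeKernelWithState(State);
}
```

Both schemes compute the same result; the second has no hidden global state, which is what lets a compiler treat the getter as pure and drop calls whose results are unused.
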
The V2 adapter now has all required functionality.
Fixes a compilation issue with the newest compute-runtime
by avoiding unnecessary temporary shared_ptr creation.
It's no longer needed after SYCLPropagateAspectsUsage pass.

---------

Signed-off-by: Sidorov, Dmitry <[email protected]>
…el#17757)

Our Ubuntu 22.04 container has CUDA 12.1 installed, while the Ubuntu 24.04
image has CUDA 12.6.1 installed.
If a PR changes the CUDA adapter, we should test the change with both CUDA
versions.
…tel#17834)

This test requires a significant amount of host memory. It has been
observed that sometimes (very rarely) the test may fail with
OUT_OF_HOST_MEMORY error, especially when run in parallel with other
"high-overhead" tests. Refer to CMPLRLLVM-66341.

This PR makes the test ignore failure if that happens. An alternative is
to check for the available host memory and skip the test if it is too
low. However, that approach is still susceptible to race conditions.
)

A number of changes to UR won't have any observable effects on SYCL
builds, so there is no point in running a full SYCL build and e2e tests
for them.
RossBrunton and others added 21 commits April 4, 2025 11:15
image_get_info.cpp previously comprised three types of tests which have
been split into separate test files

- testing queries of a `oneapi::experimental::image_mem` instance
(image_get_info.cpp)
- testing device queries for image requirements (either max image
dimensions or pitch alignment) (image_reqs_get_info.cpp)
- testing device queries for bindless images aspects
(bindless_aspects.cpp)

This PR splits up these tests, which is useful for diagnosing problems on
different backends:

- image_get_info.cpp is currently only unsupported on the HIP backend
(Known issue)
- image_reqs_get_info.cpp is currently only unsupported on the L0
backend intel#17663
- bindless_aspects.cpp has been fixed on L0 backend so it now passes on
all backends, by only querying `mipmap_max_anisotropy` if
`aspect::ext_oneapi_mipmap_anisotropy` returns true

---------

Signed-off-by: JackAKirk <[email protected]>
The driver I have at least has a number of issues, the largest of which
is that it lacks an IR compiler. This updates tests so that they are
skipped if required.
This patch removes unused handler_impl members.
Addresses issue intel#16451, where the
property `use_root_sync` is not processed properly. Also revises
`sycl/test-e2e/GroupAlgorithm/root_group.cpp` to not use the deprecated
version of `parallel_for` (which was previously blocked by this
`use_root_sync` issue).

Some explanation for the change in `handler.hpp`:
The previous code doesn't handle `use_root_sync`
correctly because `processLaunchProperties` is called twice: first for the
property list returned by the kernel functor's `get(properties_tag)`
method, and then for the `Props` passed as a parameter to
`parallel_for`. Therefore, if the `get(properties_tag)` method specifies
`use_root_sync` and `Props` is empty or doesn't contain `use_root_sync`,
the following happens:

- first, the property list returned by the kernel functor's
`get(properties_tag)` method is processed. Since it contains
`use_root_sync`, `setKernelIsCooperative(true)` is called;
- then, the property list `Props` passed as a parameter to
`parallel_for` is processed. Since it doesn't contain
`use_root_sync` (for the non-deprecated variants of
`parallel_for`, `Props` should always be an empty property list),
`setKernelIsCooperative(false)` is called.

As a result, the `MKernelIsCooperative` flag ends up set to
false when it should be true. The revised code
solves this problem.

Also `MKernelIsCooperative` is false by default, so we don't need to
worry if `setKernelIsCooperative` is not called.
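
The overwrite described above can be reproduced with a tiny standalone model. This is a sketch under assumptions: `HandlerModel`, `processBuggy`, `processFixed` and the two helper functions are hypothetical names, and the "fixed" variant is only one plausible shape of the change (raise the flag when the property is present, otherwise leave it alone), not a copy of the actual `handler.hpp` code.

```cpp
// Hypothetical stand-in for the relevant handler state; not sycl::handler.
struct HandlerModel {
  bool KernelIsCooperative = false; // false by default, as noted above

  // Previous behaviour: every processed property list overwrites the flag.
  void processBuggy(bool HasUseRootSync) {
    KernelIsCooperative = HasUseRootSync;
  }

  // Assumed fix: the flag is only ever raised, never reset to false.
  void processFixed(bool HasUseRootSync) {
    if (HasUseRootSync)
      KernelIsCooperative = true;
  }
};

// Both helpers process the kernel functor's property list first and the
// `Props` argument second, mirroring the two calls described above.
inline bool cooperativeAfterBuggy(bool KernelHas, bool PropsHas) {
  HandlerModel H;
  H.processBuggy(KernelHas);
  H.processBuggy(PropsHas);
  return H.KernelIsCooperative;
}

inline bool cooperativeAfterFixed(bool KernelHas, bool PropsHas) {
  HandlerModel H;
  H.processFixed(KernelHas);
  H.processFixed(PropsHas);
  return H.KernelIsCooperative;
}
```

With `use_root_sync` on the kernel functor and an empty `Props`, the first helper ends with the flag cleared and the second ends with it set, which is the difference the commit message describes.
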

---------

Signed-off-by: Hu, Peisen <[email protected]>
This contains a fix for google/googletest#4036,
which we encounter if we are outputting JSON and the test runner finds
no adapters.
This PR disables the flaky fill test for command buffers that was enabled in
intel#17709.
The issue is caused by a driver bug that is fixed in a newer
driver version, but the CI machines still have the old one, so the test
sometimes fails (for example
https://github.com/intel/llvm/actions/runs/14250564960/job/39942652796?pr=17836).
* Removes expression trees support
* Applies all previous vec fixes to align with recent spec changes
Fix repo_ref value. I probably didn't notice it when I changed the
others.
This tests that converting a UR handle to a native handle, then back
into a UR handle, preserves/recreates any handles contained within it.

The event test is a special case, since urEventCreateWithNativeHandle
doesn't accept a queue parameter. For this, we tolerate a different
queue being created, as long as both have the same underlying
native handle.
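
The tolerance rule for the event case can be written down as a small predicate. This is a hedged sketch: `NativeHandleModel`, `QueueModel` and `sameQueue` are hypothetical names used for illustration and do not correspond to the actual UR conformance test helpers.

```cpp
#include <cstdint>

// Hypothetical opaque native handle value.
using NativeHandleModel = std::uintptr_t;

// Hypothetical stand-in for a UR queue wrapping a native queue.
struct QueueModel {
  NativeHandleModel Native;
};

// Two queue wrappers are treated as equivalent if they are the same object
// or if they wrap the same underlying native handle.
inline bool sameQueue(const QueueModel *A, const QueueModel *B) {
  if (A == B)
    return true;
  return A != nullptr && B != nullptr && A->Native == B->Native;
}
```

A round-tripped event whose queue is a fresh wrapper around the original native queue passes this check, while one pointing at an unrelated queue does not.
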
This patch adds a step to conditionally pack pure sycl-toolchain for
release.
We removed host device support some time ago.
This commit updates OCK to apply a code generation fix that will affect
upcoming NativeCPU changes, as well as a forward compatibility fix to
prevent issues with the next LLVM pulldown or the one thereafter.
Signed-off-by: Arvind Sudarsanam <[email protected]>
Signed-off-by: Arvind Sudarsanam <[email protected]>
Signed-off-by: Arvind Sudarsanam <[email protected]>
@asudarsa asudarsa force-pushed the move_sycl_linking_code_from_linker_wrapper_to_sycl_linker branch from e0e1256 to 94d5235 Compare April 4, 2025 23:01
@asudarsa asudarsa marked this pull request as ready for review April 4, 2025 23:03
@asudarsa asudarsa closed this Apr 4, 2025