[SYCL][Graph] Add spec wording for graph-owned memory allocations #384

Bensuo · 2025-03-21T15:12:55Z

- Using sycl_ext_codeplay_async_memory_alloc extension
- Spec wording for graph support of the feature
- Usage guide examples of explicit and queue recording usage

sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc

sycl/doc/syclgraph/SYCLGraphUsageGuide.md

AerialMantis · 2025-04-02T10:32:18Z

sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc

+dependencies are correct. Using a pointer in a graph command ordered after it
+has been freed via an `async_free` node results in undefined behavior.
+
+The total amount of physical memory being used by a graph can be queried using


I think we may want either a separate query or a different return value, depending on whether the graph is modifiable or finalised. While the graph is modifiable, optimisations have not necessarily been applied yet, so the query would likely have to report the highest potential memory footprint for the graph. Once the graph is finalised and optimisations for memory re-use have been applied, the query can report the actual memory requirement of the native graph.
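To make the distinction concrete, here is a minimal C++ sketch of the two numbers such a query could report. It is a model only, not the extension's API: `GraphAllocation`, `worst_case_footprint`, and `peak_live_footprint` are hypothetical names, and the graph is assumed to be flattened into a single linear execution order, which a real DAG need not be.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one graph-owned allocation: its size and the
// positions, in an assumed linear execution order, of its alloc and free nodes.
struct GraphAllocation {
  std::size_t bytes;
  int alloc_step; // step at which the allocation node executes
  int free_step;  // step at which the free node executes (exclusive)
};

// Upper bound for a modifiable graph: no re-use is assumed, so every
// allocation is counted in full.
std::size_t worst_case_footprint(const std::vector<GraphAllocation> &allocs) {
  std::size_t total = 0;
  for (const auto &a : allocs)
    total += a.bytes;
  return total;
}

// Requirement for a finalised graph, assuming memory freed at step N can be
// re-used by an allocation made at step N or later: the peak number of bytes
// live at any one step. The peak can only be reached at an allocation step,
// so only those steps are probed.
std::size_t peak_live_footprint(const std::vector<GraphAllocation> &allocs) {
  std::size_t peak = 0;
  for (const auto &probe : allocs) {
    assert(probe.alloc_step < probe.free_step);
    std::size_t live = 0;
    for (const auto &a : allocs)
      if (a.alloc_step <= probe.alloc_step && probe.alloc_step < a.free_step)
        live += a.bytes;
    peak = std::max(peak, live);
  }
  return peak;
}
```

With three allocations of 1024, 2048, and 512 bytes, where the first is freed just as the second is made, the worst-case figure is the plain sum while the peak-live figure is smaller, which is the gap between the two query results described in the comment.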

In `ProgramManager::removeImages`, we clean up our KernelName2KernelID mapping before using that same mapping to retrieve KernelIDs in order to clean up our KernelIDs2BinImage mapping. This PR cleans up the `m_KernelID2BinImage` mapping before cleaning up the `m_KernelName2KernelIDs` mapping. This is intended to fix a hit raised by Coverity.
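The ordering issue can be illustrated with a small standalone sketch. This is not the `ProgramManager` code: `Registry`, `name_to_id`, `id_to_image`, `remove_kernels`, and `is_consistent` are hypothetical stand-ins with simplified types, assumed only for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-ins for the two mappings; the real types differ.
using KernelID = int;
struct Registry {
  std::unordered_map<std::string, KernelID> name_to_id;
  std::unordered_map<KernelID, std::string> id_to_image;
};

// True when every ID-keyed entry is still referenced by some kernel name,
// i.e. no stale entries were left behind.
bool is_consistent(const Registry &reg) {
  for (const auto &image_entry : reg.id_to_image) {
    bool referenced = false;
    for (const auto &name_entry : reg.name_to_id)
      if (name_entry.second == image_entry.first)
        referenced = true;
    if (!referenced)
      return false;
  }
  return true;
}

// Removes all entries for the given kernel names. The ID-keyed map is cleaned
// first because finding its keys requires name_to_id to still be populated.
// Assumes the registry is consistent on entry.
void remove_kernels(Registry &reg, const std::vector<std::string> &names) {
  for (const auto &name : names) {
    auto it = reg.name_to_id.find(name);
    if (it == reg.name_to_id.end())
      continue;
    reg.id_to_image.erase(it->second); // relies on the name -> ID lookup
  }
  for (const auto &name : names)
    reg.name_to_id.erase(name); // safe now: no further lookups are needed
  assert(is_consistent(reg));
}
```

Swapping the two loops would make every `find` miss, leaving stale entries in the ID-keyed map, which is the kind of defect the reordering avoids.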

AerialMantis

Changes look good, thanks. Left a few additional comments.

This patch fixes up and enables memory pools for the HIP adapter. It is based on oneapi-src/unified-runtime#1689 and on the CUDA adapter implementation. The initial patch had segmentation faults in the CI that we couldn't reproduce locally. The same happened with this patch, and I couldn't reproduce the segfaults locally either. However, I noticed that it failed in `urUSMHostAlloc`, and that this entry point differed from the CUDA adapter version in that the HIP adapter was using a "helper" function. It turns out that the helper function was using a device pool instead of a host pool to do the allocation, which seemed obviously wrong. Replacing the helper with code similar to that used in the CUDA adapter fixes the crash in the CI.

…l#17850) This ensures that functions have the right linkage. Several functions are marked as used to prevent them from being removed as dead code before the work item loop pass and `PrepareSYCLNativeCPUPass` run.

- Using sycl_ext_codeplay_async_memory_alloc extension
- Spec wording for graph support of the feature
- Usage guide guidance for library authors
- Usage guide examples of explicit and queue recording usage with and without mem pools

Bensuo · 2025-04-14T12:53:45Z

Closing in favor of upstream PR.

EwanC reviewed Mar 24, 2025

EwanC approved these changes Mar 25, 2025

fabiomestre reviewed Mar 27, 2025

AerialMantis reviewed Apr 2, 2025

ianayl and others added 2 commits April 11, 2025 12:55

[SYCL][Driver] Revert -O0 as the default SYCL device optimization if …

92121c9

…-g is passed. (intel#17987) Reverts intel#16408

AerialMantis reviewed Apr 13, 2025

pbalcer and others added 6 commits April 14, 2025 09:44

[CI][Benchmarks] fix compute-benchmarks compilation (intel#17993)

b1051b6

This fixes a regression after intel#17876

[UR] Retain and release device handles for sub-devices (intel#17977)

02d4a8d

Follow up from intel#17931, additional fix for URT-903.

[UR] Move native cpu device code cts skip to LoadSource. (intel#17973)

f531ef3

Bensuo force-pushed the ben/async-alloc-graphs-spec branch from e0a153d to d204eb0 Compare April 14, 2025 12:46

Bensuo closed this Apr 14, 2025
