Implement work_group_static / work_group_scratch_memory #15061

Open
wants to merge 67 commits into base: sycl

@Naghasan (Contributor) commented Aug 13, 2024

This patch partially implements work_group_static and updates the proposal.

Implemented:

  • work_group_static, to handle static allocation in kernels.
  • get_dynamic_work_group_memory, to handle runtime allocation (CUDA only for now).

work_group_static is implemented by exposing SYCLScope(WorkGroup), which allows the class to be decorated with the attribute. Lowering then uses the same mechanism to place the variable in local memory.

get_dynamic_work_group_memory uses a new builtin function, __sycl_dynamicLocalMemoryPlaceholder, which is lowered into a reference to a zero-sized array global variable (GV) when targeting NVPTX. The approach for SPIR will need to differ from this lowering.
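
To make the difference between the two allocation styles concrete, here is a minimal host-only C++ model. It is not SYCL code and does not use the API from this patch: `WorkGroupModel`, `kStaticElems`, `sum_with_static`, and `sum_with_scratch` are hypothetical names chosen for illustration. The fixed-size array stands in for a work_group_static allocation, whose size is known at compile time, and the byte buffer stands in for dynamically sized work-group memory, whose size is only supplied at launch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side model; these names are not part of the patch.
constexpr std::size_t kStaticElems = 4;

struct WorkGroupModel {
    // Static case: element count fixed at compile time.
    int static_mem[kStaticElems] = {};
    // Dynamic case: byte size chosen by the caller at "launch".
    std::vector<unsigned char> scratch;

    explicit WorkGroupModel(std::size_t scratch_bytes)
        : scratch(scratch_bytes) {}
};

// Each simulated work-item adds its id to a slot of the static array,
// then the slots are summed.
inline int sum_with_static(WorkGroupModel& wg, std::size_t work_items) {
    for (std::size_t id = 0; id < work_items; ++id)
        wg.static_mem[id % kStaticElems] += static_cast<int>(id);
    int sum = 0;
    for (int v : wg.static_mem) sum += v;
    return sum;
}

// Each simulated work-item stores its id in the scratch buffer, which the
// "kernel" interprets as an array of int. Returns -1 if the buffer is too
// small for the requested number of work-items.
inline int sum_with_scratch(WorkGroupModel& wg, std::size_t work_items) {
    if (wg.scratch.size() < work_items * sizeof(int)) return -1;
    for (std::size_t id = 0; id < work_items; ++id) {
        int v = static_cast<int>(id);
        std::memcpy(wg.scratch.data() + id * sizeof(int), &v, sizeof(int));
    }
    int sum = 0;
    for (std::size_t id = 0; id < work_items; ++id) {
        int v = 0;
        std::memcpy(&v, wg.scratch.data() + id * sizeof(int), sizeof(int));
        sum += v;
    }
    return sum;
}
```

In this model, constructing `WorkGroupModel wg(8 * sizeof(int))` and calling `sum_with_scratch(wg, 8)` sizes the scratch buffer for eight ints at construction time, whereas the static array cannot change size without recompiling. The untyped byte buffer also shows why the kernel has to decide how to interpret dynamically sized memory.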

UR change oneapi-src/unified-runtime#1968

@sommerlukas (Contributor) left a comment

Changes in jit_compiler.cpp LGTM.

Naghasan added a commit to Naghasan/unified-runtime that referenced this pull request Nov 13, 2024
intel/llvm#15061 introduces a new property, work_group_scratch_memory, which allows the user to set the amount of local memory to be used.

To pass this information to the adapter, the patch adds a new launch property to urEnqueueKernelLaunchCustomExp.

The patch also changes the signature of urEnqueueKernelLaunchCustomExp to add a global offset, in order to maintain features when using this extension.

Signed-off-by: Victor Lomuller <[email protected]>

@aelovikov-intel (Contributor) left a comment

Do we need to add a test when the memory is requested but not used/eliminated?

sycl/source/detail/jit_compiler.cpp (outdated, resolved)
Naghasan added a commit to Naghasan/unified-runtime that referenced this pull request Nov 18, 2024

@Naghasan (Contributor, Author) commented

> Do we need to add a test when the memory is requested but not used/eliminated?

Oh, I missed that test. I added one where the requested scratch memory is unused in the source, thanks.

Naghasan added a commit to Naghasan/unified-runtime that referenced this pull request Nov 19, 2024