Skip to content

Add vec16 types for short types #1402

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 2, 2025

Conversation

Maetveis
Copy link
Contributor

@Maetveis Maetveis commented Jul 1, 2025

The cl_intel_subgroups_short extension added support for 16-wide vectors via #906. Update cl_intel_subgroup_local_block_io as well, to explicitly document that 16-wide vectors are supported for pointers to the local address space, also.

Only raise the minor version, treating this a doc clarification only. The Intel implementation already supports 16-wide vectors for the local address space.

Match cl_intel_subgroups_short version 1.1 and add vec16 types for block reads and writes for shorts.
@CLAassistant
Copy link

CLAassistant commented Jul 1, 2025

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating this!

Since I was confused for a bit: the cl_intel_subgroups_short extension added support for 16-wide vectors via #906. This PR updates cl_intel_subgroup_local_block_io as well, to explicitly document that 16-wide vectors are supported for pointers to the local address space, also.

I'm wondering if we need to explicitly rev the cl_intel_subgroup_local_block_io extension to add this support, though, based on:

This extension extends the subgroup block read and write functions defined by cl_intel_subgroups (and, when supported, cl_intel_subgroups_char, cl_intel_subgroups_short, and cl_intel_subgroups_long) to support reading from and writing to pointers to the __local memory address space in addition to pointers to the __global memory address space.

It also looks like the commit that added support for 16-wide vectors added it for both the global address space and the local address space.

If so, perhaps:

  1. We can keep this extension (cl_intel_subgroup_local_block_io) as version 1.0, perhaps revving only the patch version (so 1.0.1).
  2. Clarify that "For extension version 1.1 or newer" is referring to the cl_intel_subgroups_short extension and not this extension.

Then, drivers do not need to be updated to safely access this functionality.

What do you think?

@Maetveis
Copy link
Contributor Author

Maetveis commented Jul 2, 2025

Thanks for updating this!

Since I was confused for a bit: the cl_intel_subgroups_short extension added support for 16-wide vectors via #906. This PR updates cl_intel_subgroup_local_block_io as well, to explicitly document that 16-wide vectors are supported for pointers to the local address space, also.

Sorry for being so brief on the description, I wanted to get this posted yesterday, but didn't have the time to properly explain.

I'm wondering if we need to explicitly rev the cl_intel_subgroup_local_block_io extension to add this support, though, based on:

This extension extends the subgroup block read and write functions defined by cl_intel_subgroups (and, when supported, cl_intel_subgroups_char, cl_intel_subgroups_short, and cl_intel_subgroups_long) to support reading from and writing to pointers to the __local memory address space in addition to pointers to the __global memory address space.

It also looks like the commit that added support for 16-wide vectors added it for both the global address space and the local address space.

If so, perhaps:

  1. We can keep this extension (cl_intel_subgroup_local_block_io) as version 1.0, perhaps revving only the patch version (so 1.0.1).
  2. Clarify that "For extension version 1.1 or newer" is referring to the cl_intel_subgroups_short extension and not this extension.

Then, drivers do not need to be updated to safely access this functionality.

What do you think?

Sounds good to me. For context, I was looking into adding the declarations of these functions to the clang builtin header opencl-c.h and that's when I spotted that the __local 16 wide variants are not explicitly documented.

@Maetveis Maetveis requested a review from bashbaug July 2, 2025 07:01
Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Just FYI, no action required: I updated the copyright date and the version in the document header also, consistent with the change to the revision history table.

@bashbaug bashbaug merged commit f05cec6 into KhronosGroup:main Jul 2, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants