cl_ext_alive_only_barrier #1375

pjaaskel · 2025-05-19T10:06:22Z

This extension adds a new built-in function to perform barrier synchronization across the work-group even if some of the work-items are not "alive" anymore due to having returned from the kernel.
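The semantics described above can be sketched as a host-side simulation. This is only an illustration, not the extension's API: the class `AliveOnlyBarrier`, its `wait`/`retire` methods, and the toy kernel are all hypothetical names. Python threads stand in for work-items, and a work-item that returns early simply stops being counted by the barrier.

```python
import threading


class AliveOnlyBarrier:
    """Hypothetical barrier that waits only for work-items still alive."""

    def __init__(self, num_work_items):
        self._cond = threading.Condition()
        self._alive = num_work_items   # work-items that have not returned yet
        self._waiting = 0              # work-items currently blocked in wait()
        self._generation = 0           # bumped each time the barrier releases

    def _release_if_complete(self):
        # Caller holds the lock. Release once every alive work-item is waiting.
        if self._alive > 0 and self._waiting == self._alive:
            self._waiting = 0
            self._generation += 1
            self._cond.notify_all()

    def wait(self):
        with self._cond:
            generation = self._generation
            self._waiting += 1
            self._release_if_complete()
            while generation == self._generation:
                self._cond.wait()

    def retire(self):
        """Called when a work-item returns from the kernel."""
        with self._cond:
            self._alive -= 1
            self._release_if_complete()


def run_work_group(local_size):
    barrier = AliveOnlyBarrier(local_size)
    scratch = [0] * local_size         # stands in for local memory
    result = [None] * local_size

    def work_item(lid):
        try:
            if lid % 2 == 1:
                return                 # early return: never reaches the barrier
            scratch[lid] = lid * 10
            barrier.wait()             # waits only for the alive work-items
            result[lid] = scratch[(lid + 2) % local_size]
        finally:
            barrier.retire()

    threads = [threading.Thread(target=work_item, args=(i,))
               for i in range(local_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    print(run_work_group(8))
```

If the hypothetical class were swapped for a plain `threading.Barrier(local_size)`, the even-numbered work-items would block forever, because the odd-numbered ones never arrive at the barrier.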

CLAassistant · 2025-05-19T10:06:34Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

pjaaskel · 2025-05-19T10:07:41Z

Ping @kpet @bashbaug @Kerilk @karolherbst.

karolherbst · 2025-05-19T10:21:07Z

I wonder if this is necessary, or whether the OpenCL spec could be relaxed here instead. The OpControlBarrier SPIR-V instruction is already defined to wait only on active invocations (see Khronos-internal SPIR-V MRs 280 and 329), and I'm not aware of any hardware that behaves differently.

pjaaskel · 2025-05-19T11:55:53Z

@karolherbst: interesting. I didn't know SPIR-V changed the barrier semantics in v1.7. I don't see the wording spelled out in the spec explicitly.

This is a pretty drastic change, which basically makes v1.7 backwards incompatible with v1.6 for targets that do not implement the "active/alive only" semantics. There could be devices we don't know of where it's (significantly) more expensive to implement. Also, vectorizing WGs of kernels with such barriers on CPU/SIMD targets, especially on non-predicated vector ISAs, induces overheads. The cases should be compile-time analyzable though.

karolherbst · 2025-05-19T12:04:22Z

> @karolherbst: interesting. I didn't know SPIR-V changed the barrier semantics in v1.7. I don't see the wording spelled out in the spec explicitly.
>
> This is a pretty drastic change, which basically makes v1.7 backwards incompatible with v1.6 for targets that do not implement the "active/alive only" semantics. There could be devices we don't know of where it's (significantly) more expensive to implement. Also, vectorizing WGs of kernels with such barriers on CPU/SIMD targets, especially on non-predicated vector ISAs, induces overheads. The cases should be compile-time analyzable though.

I doubt it's problematic for anything other than a CPU, because the threading model there is entirely different and compares more to masked/predicated SIMD instructions. But maybe it's best to discuss this at the WG meeting and ask everybody to check whether they see any problems with it from a hardware perspective.

It would be a bit problematic for CPU implementations, so for those it might make sense to keep it explicit.

bashbaug · 2025-05-19T22:20:48Z

Couple of thoughts and corrections:

1. There is no SPIR-V 1.7 (at least not yet!), and these updates were made in a revision to SPIR-V 1.6. I believe the change to the SPIR-V spec was necessary to allow for some types of Subgroup barriers, which may be placed in non-uniform control flow on some Vulkan devices.
2. Reading through internal SPIR-V MR 280, it sounds like the SPIR-V spec update was supposed to be paired with an update to the client API environment specifications to keep the current behavior, but it doesn't look like that happened. There is a validation rule for Subgroup scope barriers, but not one for Workgroup scope:

   > In all OpenCL environments, for the Barrier Instruction OpControlBarrier, when the Scope for Execution is Subgroup, behavior is undefined unless all invocations in the sub-group execute the same dynamic instance of the instruction.
3. At least some of our GPUs would not be able to support this "alive only barrier", so we would not be able to simply relax the specification and allow this behavior unconditionally.

If this is correct, we should file an issue to add the right validation rule for Workgroup scope barriers in the OpenCL SPIR-V environment spec as well.

karolherbst · 2025-05-20T00:26:03Z

> 3. At least some of our GPUs would not be able to support this "alive only barrier", so we would not be able to simply relax the specification and allow this behavior unconditionally.

I assume it's a problem on older ones?

pjaaskel · 2025-05-20T11:49:08Z

@bashbaug thanks for the clarifications. I suggest we start with a new built-in and consider converting it to a main spec requirement in the future when there are no more relevant devices where requiring the semantics is a problem.

If these became the default barrier semantics, CPU/SIMD vectorization would need a bit of control flow analysis to detect the cases where predication is not needed. I think it's nothing to be too worried about in most cases.

I see this being used only when generating code from input languages that might have these semantics (HIP/CUDA). Even in those cases it makes sense to CF-analyze the kernel first to find out whether it really needs the semantics.
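The predication cost mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, a CPU strategy that splits the kernel at each barrier and runs every barrier-delimited region for all work-items in a loop; this is not a description of any particular implementation, and `EXITED`, `run_in_regions`, and the toy kernel are hypothetical names. With alive-only semantics, each region after a possible early return has to test a per-work-item "alive" flag.

```python
EXITED = object()  # hypothetical sentinel: the work-item returned from the kernel


def run_in_regions(local_size, regions):
    """Run each barrier-delimited region for all work-items, in order."""
    alive = [True] * local_size          # per-work-item predicate
    for region in regions:               # a barrier sits between two regions
        for lid in range(local_size):    # work-item loop over one region
            if not alive[lid]:           # the extra predication check
                continue
            if region(lid) is EXITED:
                alive[lid] = False
    return alive


def demo():
    scratch = [0] * 4                    # stands in for local memory
    out = [None] * 4

    def before_barrier(lid):
        if lid >= 2:
            return EXITED                # early return before the barrier
        scratch[lid] = lid + 1

    def after_barrier(lid):
        out[lid] = scratch[1 - lid]      # read the other alive work-item's value

    alive = run_in_regions(4, [before_barrier, after_barrier])
    return out, alive


if __name__ == "__main__":
    print(demo())
```

If control flow analysis can prove that no work-item returns before the barrier, the `alive` list and its check can be dropped and the inner loop runs unpredicated, which is the case the analysis is meant to detect.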

cl_ext_alive_only_barrier

fbb7026


bashbaug mentioned this pull request May 22, 2025

Document SPIR-V work-group barrier requirements (OpControlBarrier) #1376

Open

pjaaskel mentioned this pull request Aug 5, 2025

Unreachable control flow: PoCL executes unreachable instructions pocl/pocl#1971

Open
