Commit b6f65dd

Publish the cl_img_swap_ops extension specification. (#1201)
* Publish the cl_img_swap_ops extension specification.
* Update extensions/cl_img_swap_ops.asciidoc
  Listing the initial extension version.
  Co-authored-by: Ben Ashbaugh <[email protected]>
* Update cl_img_swap_ops.asciidoc
  Defining behavior as undefined for cases when the number of work-items is not evenly divisible by four and if some work-items in the block of four are inactive, defining 1-dimensional local ID as a base for grouping work-items.

---------

Co-authored-by: Ben Ashbaugh <[email protected]>
1 parent b648551 commit b6f65dd

2 files changed: +136 -0 lines changed

extensions/cl_img_swap_ops.asciidoc

Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
1+
:data-uri:
2+
:icons: font
3+
include::../config/attribs.txt[]
4+
:source-highlighter: coderay
5+
6+
= cl_img_swap_ops
7+
8+
== Name Strings
9+
10+
`cl_img_swap_ops`
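
A host application would typically look for this name string in the space-separated list returned by `clGetDeviceInfo` with `CL_DEVICE_EXTENSIONS` before relying on the extension. The following is a minimal sketch of that check, not text from the specification; `has_extension` is a hypothetical helper, and it assumes the extension list has already been queried into a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the OpenCL API): returns 1 if `name`
 * appears as a whole space-separated token in `ext_list`, otherwise 0. */
static int has_extension(const char *ext_list, const char *name)
{
    assert(ext_list != NULL && name != NULL);

    size_t len = strlen(name);
    const char *p = ext_list;

    if (len == 0)
        return 0;

    /* Visit every occurrence and accept only whole-token matches, so that
     * a longer name sharing the same prefix is not mistaken for a match. */
    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += len;
    }
    return 0;
}
```

Matching whole tokens rather than using a bare substring search avoids a false positive if another extension name happens to begin with `cl_img_swap_ops`.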
11+
12+
== Contact
13+
14+
Imagination Technologies Developer Forum: +
15+
https://forums.imgtec.com/
16+
17+
Tomasz Platek, Imagination Technologies (Tomasz.Platek 'at' imgtec.com)
18+
19+
== Contributors
20+
21+
CY Cheng, Imagination Technologies. +
22+
Tomasz Platek, Imagination Technologies.
23+
24+
== Notice
25+
26+
Copyright (c) 2024 Imagination Technologies Ltd. All Rights Reserved.
27+
28+
== Status
29+
30+
Final Draft
31+
32+
== Version
33+
34+
Built On: {docdate} +
35+
Version: 1.0.0
36+
37+
== Dependencies
38+
39+
This extension is written against the OpenCL C Specification Version 3.0.16.
40+
41+
== Overview
42+
43+
This extension adds built-in functions that expose a hardware capability of Imagination GPU IP: swapping values between work-items.
44+
45+
== New OpenCL C Feature Names
46+
47+
[source,c]
----
__opencl_img_swap
----
51+
52+
== New OpenCL C Functions
53+
54+
Perform the swap operation:
55+
56+
[source,c]
----
gentype img_swap_x(gentype value);
gentype img_swap_y(gentype value);
----
61+
62+
== Modifications to the OpenCL C Specification
63+
64+
(Add to Table 16 - Built-in Scalar and Vector Argument Common Functions in Section 6.15.4 - Common Functions) ::
65+
+
66+
--
67+
[cols="1,2",options="header"]
68+
|====
69+
| Function | Description
70+
| gentype *img_swap_x*(gentype value)
71+
a| `img_swap_x` swaps `value` between work-items in the following way: all work-items are divided into blocks of four consecutive work-items. For each block:
72+
73+
* In the first work-item, `img_swap_x` returns `value` passed as an argument in the second work-item.
74+
* In the second work-item, `img_swap_x` returns `value` passed as an argument in the first work-item.
75+
* In the third work-item, `img_swap_x` returns `value` passed as an argument in the fourth work-item.
76+
* In the fourth work-item, `img_swap_x` returns `value` passed as an argument in the third work-item.
77+
78+
The work-items are assigned into blocks based on their 1-dimensional local ID (see `get_local_linear_id`).
79+
80+
The number of work-items that make up a work-group must be evenly divisible by four; otherwise, the behaviour is undefined.
81+
82+
The function must be called in all four work-items of the block; otherwise, the behaviour is undefined.
83+
84+
Requires that the `__opencl_img_swap` feature macro is defined.
85+
| gentype *img_swap_y*(gentype value)
86+
a| `img_swap_y` swaps `value` between work-items in the following way: all work-items are divided into blocks of four consecutive work-items. For each block:
87+
88+
* In the first work-item, `img_swap_y` returns `value` passed as an argument in the third work-item.
89+
* In the third work-item, `img_swap_y` returns `value` passed as an argument in the first work-item.
90+
* In the second work-item, `img_swap_y` returns `value` passed as an argument in the fourth work-item.
91+
* In the fourth work-item, `img_swap_y` returns `value` passed as an argument in the second work-item.
92+
93+
The work-items are assigned into blocks based on their 1-dimensional local ID (see `get_local_linear_id`).
94+
95+
The number of work-items that make up a work-group must be evenly divisible by four; otherwise, the behaviour is undefined.
96+
97+
The function must be called in all four work-items of the block; otherwise, the behaviour is undefined.
98+
99+
Requires that the `__opencl_img_swap` feature macro is defined.
100+
|====
101+
--
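
The pairing rules in the table can be summarised as an index calculation on the 1-dimensional local ID: `img_swap_x` exchanges with the work-item whose ID differs in bit 0, and `img_swap_y` with the one whose ID differs in bit 1. The sketch below is a host-side C model of that reading, not an implementation of the built-ins. The function names are hypothetical; `local_linear_id` follows the formula the OpenCL C specification gives for `get_local_linear_id`, and the model assumes every work-item in the work-group calls the function.

```c
#include <assert.h>
#include <stddef.h>

/* Linear local ID, following the OpenCL C definition of
 * get_local_linear_id: (z * size_y * size_x) + (y * size_x) + x. */
static size_t local_linear_id(size_t x, size_t y, size_t z,
                              size_t size_x, size_t size_y)
{
    return (z * size_y * size_x) + (y * size_x) + x;
}

/* Partner for img_swap_x: first<->second, third<->fourth in each block. */
static size_t swap_x_partner(size_t lid) { return lid ^ 1u; }

/* Partner for img_swap_y: first<->third, second<->fourth in each block. */
static size_t swap_y_partner(size_t lid) { return lid ^ 2u; }

/* Hypothetical host-side model: in[lid] is the value each work-item
 * passes, out[lid] is the value it would get back from img_swap_x. */
static void model_swap_x(const int *in, int *out, size_t group_size)
{
    /* The specification leaves other work-group sizes undefined. */
    assert(group_size % 4 == 0);
    for (size_t lid = 0; lid < group_size; ++lid)
        out[lid] = in[swap_x_partner(lid)];
}

/* Same model for img_swap_y. */
static void model_swap_y(const int *in, int *out, size_t group_size)
{
    assert(group_size % 4 == 0);
    for (size_t lid = 0; lid < group_size; ++lid)
        out[lid] = in[swap_y_partner(lid)];
}
```

Because each block starts at a multiple of four, flipping one of the two low bits of the linear ID never leaves the block, which is why a single XOR covers every block in the work-group.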
102+
103+
== Coding Sample
104+
105+
This coding sample shows how to use the *img_swap_x* function:
106+
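[source,c]
----
__kernel void swap() {
    // With a single work-group of four work-items, the global ID equals
    // the 1-dimensional local ID used to form the block.
    int i = (int)get_global_id(0);
    // Exchange the value with the neighbouring work-item in the block.
    int res = img_swap_x(i);

    printf("id: %d, res = [ %d ]\n", i, res);
}
----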
115+
116+
Executing four work-items of this kernel in one work-group gives the following result:
117+
[source]
----
id: 0, res = [ 1 ]
id: 1, res = [ 0 ]
id: 2, res = [ 3 ]
id: 3, res = [ 2 ]
----
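
One possible use of the two functions together, offered only as an illustration and not taken from the specification, is a sum across a block of four: adding the `img_swap_x` partner's value and then the `img_swap_y` partner's partial sum leaves every work-item holding the block total. In a kernel this might read `s = v + img_swap_x(v); s += img_swap_y(s);`. The host-side C model below emulates those two steps on an array indexed by local ID; `model_block_sum` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side model of a four-wide sum built from the two
 * swaps. On return, every entry holds the total of its block of four. */
static void model_block_sum(int *v, size_t group_size)
{
    /* Matches the extension's requirement on the work-group size. */
    assert(group_size % 4 == 0);

    for (size_t base = 0; base < group_size; base += 4) {
        int partial[4];

        /* Step 1, models s = v + img_swap_x(v): pair sums (0,1), (2,3). */
        for (size_t lane = 0; lane < 4; ++lane)
            partial[lane] = v[base + lane] + v[base + (lane ^ 1u)];

        /* Step 2, models s += img_swap_y(s): combine the two pair sums. */
        for (size_t lane = 0; lane < 4; ++lane)
            v[base + lane] = partial[lane] + partial[lane ^ 2u];
    }
}
```

For the inputs 1, 2, 3, 4 in one block, step 1 yields 3, 3, 7, 7 and step 2 yields 10 in all four positions.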
124+
125+
== Version History
126+
127+
[cols="5,15,15,70"]
128+
[grid="rows"]
129+
[options="header"]
130+
|====
131+
| Version | Date | Author | Changes
132+
| 1.0.0 | 2024-06-19 | Tomasz Platek | *Initial revision*
133+
|====
134+

extensions/extensions.txt

Lines changed: 2 additions & 0 deletions
@@ -73,6 +73,8 @@ include::cl_img_matrix_multiply.asciidoc[]
73 73
<<<
74 74
include::cl_img_mem_properties.asciidoc[]
75 75
<<<
76+
include::cl_img_swap_ops.asciidoc[]
77+
<<<
76 78
include::cl_img_use_gralloc_ptr.asciidoc[]
77 79
<<<
78 80
include::cl_img_yuv_image.asciidoc[]
