Open
Labels: enhancement (New feature or request)
Description
The example script below launches B small kernels, each of which pays its own launch overhead.
These could be grouped into one larger kernel so that the overhead is paid only once.
It would be nice if pykokkos had a way to indicate kernel batching
while keeping the current structure of the outer for loop.
import cupy as cp
import pykokkos as pk


@pk.workunit
def work(wid, a):
    a[wid] = a[wid] + 1


def main():
    B = 10  # number of batches (one kernel launch each)
    N = 10  # elements per batch
    a = cp.ones((B, N))
    pk.set_default_space(pk.Cuda)

    # One small kernel per row of `a`: B launches in total.
    for batch in range(B):
        pk.parallel_for("work", 10, work, a=a[batch])
    print(a)


main()