Description
Problem
I have found an issue with CUDA 11.1: creating an FFT plan, using it, performing another operation (a simple sum reduction), then deleting the plan, re-creating another one and repeating the sequence ends with a `cuFuncSetBlockShape failed: invalid resource handle` error.
The following minimal example reproduces the issue (it must be run in a fresh session for reproducibility):
```python
import numpy as np
import pycuda.gpuarray as cua
import pycuda.autoinit
import skcuda.fft as cu_fft

fft_shape = (128, 128)
plan = cu_fft.Plan(fft_shape, np.complex64, np.complex64, batch=1)
a = cua.to_gpu(np.random.uniform(0, 1, fft_shape).astype(np.complex64))
cu_fft.fft(a, a, plan)  # first in-place FFT works
tmp = cua.sum(a)        # first reduction works

del plan                # delete the plan...
plan = cu_fft.Plan(fft_shape, np.complex64, np.complex64, batch=1)
cu_fft.fft(a, a, plan)  # ...and repeat with a new one
tmp = cua.sum(a)        # -> LogicError: cuFuncSetBlockShape failed
```
Running the above code in a fresh Python session always fails with the following error:
```
---> 17 tmp = cua.sum(a)

~/dev/py38-env/lib/python3.8/site-packages/pycuda/gpuarray.py in sum(a, dtype, stream, allocator)
   1639     from pycuda.reduction import get_sum_kernel
   1640     krnl = get_sum_kernel(dtype, a.dtype)
-> 1641     return krnl(a, stream=stream, allocator=allocator)
   1642
   1643

~/dev/py38-env/lib/python3.8/site-packages/pycuda/reduction.py in __call__(self, *args, **kwargs)
    283
    284         # print block_count, seq_count, self.block_size, sz
--> 285         f((block_count, 1), (self.block_size, 1, 1), stream,
    286             *([result.gpudata]+invocation_args+[seq_count, sz]),
    287             **kwargs)

~/dev/py38-env/lib/python3.8/site-packages/pycuda/driver.py in function_prepared_async_call(func, grid, block, stream, *args, **kwargs)
    547 def function_prepared_async_call(func, grid, block, stream, *args, **kwargs):
    548     if isinstance(block, tuple):
--> 549         func._set_block_shape(*block)
    550     else:
    551         from warnings import warn

LogicError: cuFuncSetBlockShape failed: invalid resource handle
```
The error occurs during the PyCUDA sum reduction, but it seems to be triggered by deleting the plan and creating a new one, so cuFFT may be involved.
I noted that the CUDA 11.1 release notes state: "After successfully creating a plan, cuFFT now enforces a lock on the cufftHandle. Subsequent calls to any planning function with the same cufftHandle will fail", but I have no idea whether that is related.
Environment
- OS platform: Linux (tested on power64/debian10, and also on fresh x86_64 cloud machines (from vast.ai) based on https://hub.docker.com/r/nvidia/cuda/ , e.g. the nvidia/cuda:11.1-devel and nvidia/cuda:11.0-devel images)
- Python version: 3.8 (probably not version-dependent)
- CUDA version: 11.0 (with driver 455.45.01), 11.1 (with driver 450.80.02, 455.23.05 or 455.38)
- PyCUDA version: pycuda.VERSION = (2020, 1)
- scikit-cuda version: latest git 806ee27 (the pip-installed 0.5.3 also has the issue)