Description
Please fill in the following for any issues
Your setup:
- Operating System (Linux, MacOS, Windows): Windows
- Hardware type (x86, ARM..) and RAM: x86, AMD Threadripper PRO 3945WX, 32 GB DDR4 RAM
- Python Version (e.g. 3.9): 3.9.15
- Caiman version (e.g. 1.9.12): 1.9.12
- Which demo exhibits the problem (if applicable): the demos run fine with the CPU instructions.
- How you installed Caiman (pure conda, conda + compile, colab, ..):
  mamba install -c conda-forge caiman=1.9.12=py39h2ba5b7c_0 tensorflow=2.10=gpu_py39h9bca9fa_0 "numpy<1.24"
  pip install git+https://github.com/inducer/pycuda.git
  pip install git+https://github.com/lebedov/scikit-cuda.git
- Details:
I've been working on getting CaImAn to run with CUDA support for FFT calculations, but I've run into a problem.
I'm using an NVIDIA RTX 3090, so I'm limited to CUDA 11.0 or higher to support this architecture.
When I run piecewise-rigid motion correction with the GPU enabled for FFT calculations, the code crashes when it calls `cudafft`, with the error `pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resources handle`.
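For reference, this is roughly how I'm invoking the GPU path (the file name and motion parameters below are just placeholders, not my actual values):

```python
from caiman.motion_correction import MotionCorrect

fnames = ['example_movie.tif']  # placeholder file name, not my actual data

# motion parameters here are illustrative defaults, not my exact settings
mc = MotionCorrect(fnames,
                   max_shifts=(6, 6),
                   strides=(48, 48),
                   overlaps=(24, 24),
                   max_deviation_rigid=3,
                   pw_rigid=True,      # piecewise-rigid correction
                   use_cuda=True)      # route the FFTs to the GPU
mc.motion_correct(save_movie=True)
```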
I'm grasping at straws here since I'm not super familiar with CUDA programming.
Here's a related scikit-cuda thread, where a user mentions that CUDA 11 has a known issue: "cuFFT planning and plan estimation functions may not restore correct context affecting CUDA driver API applications":
lebedov/scikit-cuda#308
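To try to narrow this down outside of CaImAn, here is a minimal standalone sketch of the same pycuda + scikit-cuda FFT path (my own sketch, not code from CaImAn; the array size and dtype are arbitrary, and I haven't confirmed it reproduces the crash):

```python
import numpy as np
import pycuda.driver as cuda
import pycuda.gpuarray as gpuarray
from skcuda import fft as cudafft

cuda.init()
ctx = cuda.Device(0).make_context()  # creates the context and makes it current

try:
    data = np.random.rand(512, 512).astype(np.complex64)
    d_in = gpuarray.to_gpu(data)
    d_out = gpuarray.empty((512, 512), np.complex64)

    # plan creation is the step where, per the thread above, CUDA 11 may not
    # restore the correct current context
    plan = cudafft.Plan((512, 512), np.complex64, np.complex64)

    # if the context really was clobbered, I'd expect this launch to fail the
    # same way motion correction does
    cudafft.fft(d_in, d_out, plan)
    print("GPU FFT completed without a context error")
finally:
    ctx.pop()
```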
My guess is that something needs to be done about `init_cuda_process` or `close_cuda_process` and how contexts are initialized, destroyed, passed around, etc.
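To make concrete what I mean by context handling, the per-process pattern I have in mind is sketched below; the names `init_cuda_worker` / `close_cuda_worker` are made up for illustration, and this is not CaImAn's actual code:

```python
import pycuda.driver as cudadrv

cudactx = None  # one context per worker process, held at module scope

def init_cuda_worker():
    """Create a CUDA context for this worker and make it current."""
    global cudactx
    cudadrv.init()
    cudactx = cudadrv.Device(0).make_context()  # also pushes it onto the stack

def close_cuda_worker():
    """Explicitly tear down the worker's context."""
    global cudactx
    if cudactx is not None:
        cudactx.pop()     # remove it from this thread's context stack
        cudactx.detach()  # release our reference so the driver can free it
        cudactx = None
```

If the cuFFT plan routines leave a different context current somewhere between that setup and the kernel launch, that would line up with the invalid-resource-handle error, but again, this is just a guess.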
If anyone here is working with an NVIDIA 30-series card and isn't running into this problem, please let me know.