Runtime error at the start of training #17

lmattos-11 · 2022-05-09T18:54:04Z

Hi,

I am trying to train and evaluate the network for a small-scale experiment. I have managed to get everything working up to training: setting up the environment, rendering the input image data, changing configurations, and resolving dependency issues.

At the start of training in train.py I get the following error:

RuntimeError: cuda runtime error (8) : invalid device function at /opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCGeneral.cpp:405

I am running the training on a single NVIDIA GeForce GPU with 25GB memory usage. The driver version is 510.60.02 and the CUDA version is 11.6. I suspect that this error comes from an incompatibility between my GPU's CUDA version and the versions of PyTorch and the other packages set up by the environment provided in environment.yml.

I was wondering if anyone or the authors have run into a similar issue or have any suggestions on how to manage this possible incompatibility.

I tried to install higher versions of PyTorch and torchvision, and to install cudatoolkit in the environment, but this led to incompatibilities with the versions of other packages specified in environment.yml. Running conda update --all also created dependency issues.
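For anyone trying to reproduce this, one way to check the suspected mismatch is to print what the installed PyTorch build reports about itself and the GPU. The sketch below is only a diagnostic aid and is not part of the repository; `describe_cuda_setup` is a hypothetical helper name, and `torch.cuda.get_arch_list` is looked up defensively because older PyTorch releases may not provide it.

```python
def describe_cuda_setup():
    """Collect version info about the installed PyTorch build and the GPU."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch": None}

    info = {
        "torch": torch.__version__,
        # CUDA version the PyTorch binary was built against (None for CPU-only builds).
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
        # (major, minor) compute capability of GPU 0.
        info["device_capability"] = torch.cuda.get_device_capability(0)

    # Architectures the binary was compiled for; may be missing in old releases.
    get_arch_list = getattr(torch.cuda, "get_arch_list", None)
    if get_arch_list is not None:
        info["compiled_archs"] = get_arch_list()
    return info


if __name__ == "__main__":
    for key, value in describe_cuda_setup().items():
        print(f"{key}: {value}")
```

Comparing `built_with_cuda` and `device_capability` against the driver's reported CUDA version may help narrow down where the versions disagree.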


ngailapdi · 2022-09-13T09:49:23Z

Hi @lmattos-11,

Apologies for the late response. I believe the issue can be fixed by installing the correct cudatoolkit version. The cudatoolkit version in environment.yml is 10.2; please update it to match yours (version 11.6).
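After changing the cudatoolkit pin and recreating the environment, it may be worth confirming that the PyTorch build actually picked up the new version. A minimal sketch, assuming the target is 11.6 as above; `build_matches_toolkit` is a hypothetical helper, not something provided by this repository or by PyTorch:

```python
def build_matches_toolkit(build_cuda, expected):
    """Return True if two CUDA version strings agree on major.minor.

    build_cuda may be None, which is what torch.version.cuda reports
    for CPU-only builds.
    """
    if build_cuda is None:
        return False
    return build_cuda.split(".")[:2] == expected.split(".")[:2]


if __name__ == "__main__":
    try:
        import torch
        build_cuda = torch.version.cuda
    except ImportError:
        # PyTorch is not installed; treat as "no CUDA build".
        build_cuda = None
    print("PyTorch CUDA build:", build_cuda)
    print("Matches 11.6:", build_matches_toolkit(build_cuda, "11.6"))
```

If the check still reports 10.2, the installed PyTorch package was probably not replaced by a build for the newer toolkit.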

ngailapdi closed this as completed Sep 13, 2022

ngailapdi reopened this Sep 13, 2022
