RuntimeError: CUDA error: device-side assert triggered #66

Open
@lhp121

Description

I checked some comments on other issues. My neuralsim commit is d0bee89 and my nr3d_lib commit is b3d9627. My computer runs Ubuntu 22.04 and my graphics card is an NVIDIA RTX 4000 Ada. Following @ventusff's replies under other issues, I downloaded CUDA 11.7 and torch 2.0.0, and I no longer get the "If capturable=True, params and state_steps must be CUDA tensors." error.
However, I now get "RuntimeError: CUDA error: device-side assert triggered. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions." I made some modifications, but training still cannot complete: at 2% or 3% it throws "RuntimeError: CUDA error: device-side assert triggered". Is this because the code version I installed is too old, or is it something else? Has anyone else encountered this issue and would like to discuss it?

[Image: jietu (screenshot)]
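For anyone trying to reproduce this, a minimal sketch of one way to narrow down a device-side assert: rerun the job with `CUDA_LAUNCH_BLOCKING=1`, so that kernels launch synchronously and the Python traceback is more likely to point at the operation that actually failed. The helper names `debug_env` and `rerun_with_blocking` are hypothetical, and the training command is left as a placeholder because the exact command line is not given in this report.

```python
import os
import subprocess
import sys


def debug_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled.

    CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous, so an error
    surfaces at the call that caused it rather than at a later, unrelated call.
    The input mapping is copied, never modified.
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_with_blocking(cmd):
    """Run `cmd` (a list of strings) in a child process using debug_env().

    Returns the child's exit code. The variable is set for the child only,
    before its CUDA context is created.
    """
    return subprocess.run(cmd, env=debug_env(), check=False).returncode


# Hypothetical usage; replace the placeholder with the real training command:
# exit_code = rerun_with_blocking([sys.executable, "<training script>", "<args>"])
```

With synchronous launches the run is slower, so this is only meant for the short reproduction up to the 2% or 3% mark where the assert fires.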
