Number of threads per block exceeds kernel limit on some GPUs for some setups #4034
Unanswered
ali-ramadhan asked this question in Helpdesk
-
Yes, I have seen this. Specifically for
-
@NoraLoose this is the "cryptic error" I was referring to re: reductions in ClimaOcean.
-
Has anyone else encountered errors of this kind when running simulations? It seems to be GPU-specific: for example, the same setup will work on an RTX 4090 and an A100 but not on a V100 or an H100. It may also be correlated with the complexity of the simulation, setup, or types involved.

The stack traces seem to point to a call to `maximum` of a `Field` (in progress-monitoring simulation callbacks, I think), but due to the asynchronous nature of CUDA.jl I don't know if that's the actual kernel at fault.

I've been struggling to get a MWE and want to dedicate some time to chasing this issue, but I don't have a concrete issue yet, so I thought I'd open a discussion to see if anyone else has experienced this.
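For anyone who wants to poke at this, here is a minimal sketch of how the per-kernel thread limit could be compared against the device limit, and how a reduction could be synchronized so that any error surfaces at the call site. It assumes only CUDA.jl; `scale_kernel!` is a hypothetical stand-in, not the kernel that actually fails, and the array `y` stands in for the data of a `Field`.

```julia
using CUDA

# Hypothetical stand-in kernel; the kernel that actually fails is unknown.
function scale_kernel!(y, x, a)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(y)
        @inbounds y[i] = a * x[i]
    end
    return nothing
end

if CUDA.functional()
    dev = CUDA.device()

    # Device-wide upper bound on the number of threads per block.
    device_limit = CUDA.attribute(dev, CUDA.DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK)

    x = CUDA.rand(Float32, 1024)
    y = similar(x)

    # Compile without launching so the compiled kernel can be inspected.
    kernel = @cuda launch=false scale_kernel!(y, x, 2f0)

    # Per-kernel limit; it can be lower than the device limit,
    # for example when the kernel uses many registers per thread.
    kernel_limit = CUDA.maxthreads(kernel)
    @info "Thread limits" device=CUDA.name(dev) device_limit kernel_limit registers=CUDA.registers(kernel)

    # Launch with a block size that respects the per-kernel limit.
    threads = min(length(y), kernel_limit)
    blocks = cld(length(y), threads)
    kernel(y, x, 2f0; threads, blocks)

    # Block until the GPU work issued by this call has finished, so an
    # error is reported here rather than at a later, unrelated call.
    m = CUDA.@sync maximum(y)
end
```

Since register usage depends on the compiled code for each architecture, the value returned by `CUDA.maxthreads` for the same kernel may differ between GPUs, which would be consistent with the GPU-specific behaviour described above. That is a guess, not a confirmed diagnosis.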