Description
Dear Relion community,
We are trying to install RELION 5.0 on a workstation with the newly released NVIDIA GeForce RTX 5060 Ti 16GB and are running into PyTorch compatibility issues. The machine runs Ubuntu 24.04.02 with a compatible NVIDIA driver and CUDA 12.8. If we do not install PyTorch separately before installing RELION, 3D classification with Blush regularization fails with Error message 1 below.
Environment:
- OS: Ubuntu 24.04.02 LTS
- MPI runtime: OpenRTE 4.1.6
- RELION version: RELION-5.0.0-commit-adfec8
- Memory: 280 GB
- GPU: NVIDIA GeForce RTX 5060 Ti
Dataset:
- Box size: 60 px
- Pixel size: 2.2 Å
- Number of particles: 80,000
- Description: helix
Job options:
- Type of job: Class3D
- Number of MPI processes:
- Number of threads: 1
- Full command:
`which relion_refine_mpi` --o Class3D/job305/run --i Select/job065/particles.star --ref InitialModel/job067/run_it100_class001.mrc --firstiter_cc --trust_ref_size --ini_high 15 --dont_combine_weights_via_disc --preread_images --pool 10 --pad 2 --ctf --iter 25 --tau2_fudge 4 --particle_diameter 110 --K 1 --flatten_solvent --zero_mask --strict_highres_exp 12 --blush --oversampling 1 --healpix_order 2 --offset_range 2 --offset_step 2 --sym C1 --norm --scale --helix --helical_outer_diameter 72 --helical_nr_asu 3 --helical_twist_initial 100 --helical_rise_initial 10.2 --helical_z_percentage 0.4 --helical_symmetry_search --helical_twist_min 50 --helical_twist_max 150 --helical_twist_inistep 5 --helical_rise_min 5 --helical_rise_max 20 --helical_rise_inistep 2 --helical_keep_tilt_prior_fixed --sigma_tilt 3.33333 --sigma_psi 3.33333 --sigma_rot 0 --helical_sigma_distance 0.666667 --j 3 --gpu "0,1" --pipeline_control Class3D/job305/
Error message 1:
/home/user/.conda/envs/relion-5.0/lib/python3.10/site-packages/torch/cuda/__init__.py:287:
UserWarning:
NVIDIA GeForce RTX 5060 Ti with CUDA capability sm_120 is not
compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_50 sm_60
sm_70 sm_75 sm_80 sm_86 sm_90.
If you want to use the NVIDIA GeForce RTX 5060 Ti GPU with PyTorch,
please check the instructions at
https://pytorch.org/get-started/locally/
warnings.warn(
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/user/.conda/envs/relion-5.0/lib/python3.10/site-packages/relion_blush/command_line.py",
line 335, in main
class3d(
File "/home/user/.conda/envs/relion-5.0/lib/python3.10/site-packages/relion_blush/command_line.py",
line 189, in class3d
denoised_nv, _ = apply_model(
File "/home/user/.conda/envs/relion-5.0/lib/python3.10/site-packages/torch/utils/_contextlib.py",
line 116, in decorate_context
return func(*args, **kwargs)
File "/home/user/.conda/envs/relion-5.0/lib/python3.10/site-packages/relion_blush/util.py",
line 177, in apply_model
infer_grid = torch.zeros_like(volume).to(device)
RuntimeError: CUDA error: no kernel image is available for execution
on the device
CUDA kernel errors might be asynchronously reported at some other API
call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Something went wrong in the external Python call...
Command: relion_python_blush
Class3D/job305/run_it001_class001_external_reconstruct.star --gpu
0,1,
---------------------------------- PYTHON ERROR
---------------------------------
Has RELION been provided a Python interpreter with the correct environment?
The interpreter can be passed to RELION either during Cmake configuration by
using the Cmake flag -DPYTHON_EXE_PATH=<path/to/python/interpreter>.
NOTE: For some modules TORCH_HOME needs to be set to find pretrained models
---------------------------------------------------------------------------------
Using python executable: /home/user/.conda/envs/relion-5.0/bin/python
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero
status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[43628,1],1]
Exit code: 1
--------------------------------------------------------------------------
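The sm_120 warning above can be checked outside RELION. The following is a minimal sketch, not part of RELION or Blush, that mirrors the kind of comparison behind the warning. It assumes a PyTorch build can only run on a GPU whose major compute capability matches one of its compiled `sm_XX` architectures, and it ignores PTX forward compatibility. The helper names `supported_majors`, `has_kernels_for`, and `report` are hypothetical; `torch.cuda.get_arch_list()`, `torch.cuda.get_device_capability()`, and `torch.version.cuda` are existing PyTorch calls.

```python
import re


def supported_majors(arch_list):
    """Collect the major compute-capability numbers found in names like 'sm_86'."""
    majors = set()
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)[a-z]?", arch)
        if match:
            # 'sm_120' -> 120 -> major 12; 'sm_86' -> 86 -> major 8
            majors.add(int(match.group(1)) // 10)
    return majors


def has_kernels_for(capability, arch_list):
    """Assumption: kernels are usable only if the GPU's major version was compiled in."""
    major, _minor = capability
    return major in supported_majors(arch_list)


def report():
    """Print what the installed PyTorch was built for and whether each GPU is covered."""
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
    archs = torch.cuda.get_arch_list()
    print("compiled architectures:", archs)
    for idx in range(torch.cuda.device_count()):
        cap = torch.cuda.get_device_capability(idx)
        status = "covered" if has_kernels_for(cap, archs) else "NOT covered"
        print(f"GPU {idx}: capability {cap[0]}.{cap[1]} is {status}")


if __name__ == "__main__":
    report()
```

Running this with the interpreter RELION reports using (`/home/user/.conda/envs/relion-5.0/bin/python` in the log above) shows which PyTorch build that environment actually contains. With the architecture list from the warning (sm_50 through sm_90), a capability 12.0 device would be reported as not covered.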
We also tried installing a nightly build of PyTorch first (2.9.0.dev20250717+cu128), but the RELION installation then failed with a second error, which lists only PyTorch versions up to 2.7.1 as available.
Error Message 2:
ERROR: Could not find a version that satisfies the requirement
torch==2.9.0.dev20250717+cu128 (from versions: 1.11.0, 1.12.0, 1.12.1,
1.13.0, 1.13.1, 2.0.0, 2.0.1, 2.1.0, 2.1.1, 2.1.2, 2.2.0, 2.2.1,
2.2.2, 2.3.0, 2.3.1, 2.4.0, 2.4.1, 2.5.0, 2.5.1, 2.6.0, 2.7.0, 2.7.1)
ERROR: No matching distribution found for torch==2.9.0.dev20250717+cu128
After manually installing PyTorch 2.7.1, the same 3D classification job still fails with Error message 1, so this build also appears to be incompatible with the RTX 5060 Ti. Are there any other workarounds for installing RELION 5.0 properly on this GPU? Thank you for your help!