CUDA 12 debs require libnvidia-compute #11140

@trxcllnt

Describe the bug

The ucx-cuda deb package has a hard dependency on libnvidia-compute.

This is undesirable when installing the ucx-cuda package in a container, because libnvidia-compute installs a CUDA driver that conflicts with the driver mounted by the container runtime.

Steps to Reproduce

$ wget -q -O ucx.tar.bz2 https://github.com/openucx/ucx/releases/download/v1.19.0/ucx-1.19.0-ubuntu24.04-mofed5-cuda12-x86_64.tar.bz2
$ tar -xvjf ucx.tar.bz2
$ apt -y install --no-install-recommends ucx-1.19.0.deb ucx-cuda-1.19.0.deb ucx-xpmem-1.19.0.deb
> Reading package lists...
> Building dependency tree...
> Reading state information...
> The following additional packages will be installed:
>   libnvidia-cfg1 libnvidia-common libnvidia-compute libnvidia-decode
>   libnvidia-gpucomp nvidia-persistenced
> Suggested packages:
>   nvidia-driver-pinning-590
> The following NEW packages will be installed:
>   libnvidia-cfg1 libnvidia-common libnvidia-compute libnvidia-decode
>   libnvidia-gpucomp nvidia-persistenced ucx ucx-cuda ucx-xpmem

Additional information (depending on the issue)

IIUC these lines should make the libnvidia-compute dependency optional.

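It appears that the string which sed -i 's/libnvidia-compute | libnvidia-ml1, //g' tmp/DEBIAN/control is meant to replace has changed. The libnvidia-ml1 alternative now carries a version constraint, so the pattern no longer matches: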

$ dpkg-deb -R ucx-cuda-1.19.0.deb tmp
$ grep Depends: tmp/DEBIAN/control
> Depends: libc6 (>= 2.34), libnvidia-compute | libnvidia-ml1 (>= 555.42.06), ucx
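A quick way to confirm the mismatch without unpacking anything is to test the old pattern against the Depends line shown above. This is only a sketch: the variable names are made up for illustration, and the line is copied from the grep output rather than read from the package.

```shell
# Depends line as reported by grep above (copied by hand, not read from the .deb).
line='Depends: libc6 (>= 2.34), libnvidia-compute | libnvidia-ml1 (>= 555.42.06), ucx'

# The old sed pattern contains no regex metacharacters that matter here,
# so a fixed-string search (grep -F) is an equivalent check.
if printf '%s\n' "$line" | grep -qF -- 'libnvidia-compute | libnvidia-ml1, '; then
  old_matches=yes
else
  old_matches=no
fi

echo "old pattern matches: $old_matches"
```

The version constraint sits between libnvidia-ml1 and the comma, so the literal text the old pattern expects never appears in the line.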

Updating the pattern with a wildcard match seems to work:

$ sed 's/libnvidia-compute | libnvidia-ml1.*, //g' tmp/DEBIAN/control | grep Depends
> Depends: libc6 (>= 2.34), ucx
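One caveat with the wildcard: `.*, ` is greedy, so it matches up to the last comma on the line. That happens to be fine for the current Depends line, but it would also swallow any dependency listed between libnvidia-ml1 and the final entry. A narrower pattern that only accepts an optional parenthesised version constraint might look like the sketch below. The helper name strip_nvidia_dep and the extra libfoo dependency are hypothetical, used only to show the difference.

```shell
# Hypothetical helper: remove the "libnvidia-compute | libnvidia-ml1" alternative,
# with or without a "(>= version)" constraint, and nothing else.
# -E selects extended regex, so the literal "|" and parentheses are escaped.
strip_nvidia_dep() {
  sed -E 's/libnvidia-compute \| libnvidia-ml1( \([^)]*\))?, //g'
}

# Current Depends line from the package.
printf '%s\n' 'Depends: libc6 (>= 2.34), libnvidia-compute | libnvidia-ml1 (>= 555.42.06), ucx' \
  | strip_nvidia_dep

# Made-up line with an extra trailing dependency, which the greedy
# wildcard version would strip "ucx" from.
printf '%s\n' 'Depends: libc6 (>= 2.34), libnvidia-compute | libnvidia-ml1 (>= 555.42.06), ucx, libfoo' \
  | strip_nvidia_dep
```

The same expression also matches the older, unversioned form of the dependency, so it should keep working if the constraint is dropped again.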
