Commit 833288c

Bump default HPC-X version to 2.24
And accommodate upstream labeling changes.
1 parent 16f6ab1 commit 833288c

3 files changed

+81
-60
lines changed

docs/building_blocks.md

Lines changed: 21 additions & 14 deletions
@@ -1540,13 +1540,19 @@ __Parameters__
 
 
 - __buildlabel__: The build label assigned by Mellanox to the tarball.
-For versions 2.17 and later, the default value is `cuda12`.
-For version 2.16 the default value is `cuda12-gdrcopy2-nccl2.18`.
-For version 2.15 the default value is `cuda12-gdrcopy2-nccl2.17`.
-For version 2.14 the default value is `cuda11-gdrcopy2-nccl2.16`.
-For versions 2.12 and 2.13 the default value is `cuda11-gdrcopy2-nccl2.12`.
-For versions 2.10 and 2.11 the default value is `cuda11-gdrcopy2-nccl2.11`.
-This value is ignored for HPC-X version 2.9 and earlier.
+For version 2.24 and later, the default value is the value of the
+`cuda` parameter. For versions 2.17 through 2.23, the default
+value is `cuda12`. For version 2.16 the default value is
+`cuda12-gdrcopy2-nccl2.18`. For version 2.15 the default value is
+`cuda12-gdrcopy2-nccl2.17`. For version 2.14 the default value is
+`cuda11-gdrcopy2-nccl2.16`. For versions 2.12 and 2.13 the
+default value is `cuda11-gdrcopy2-nccl2.12`. For versions 2.10
+and 2.11 the default value is `cuda11-gdrcopy2-nccl2.11`. This
+value is ignored for HPC-X version 2.9 and earlier.
+
+- __cuda__: The CUDA label assigned by Mellanox to the tarball. This
+parameter is only recognized for version 2.24 and later. The
+default value is `cuda13`.
 
 - __environment__: Boolean flag to specify whether the environment
 should be modified to include HPC-X. This option is only
@@ -1593,12 +1599,13 @@ tarball. For version 2.21 and later, the default value is
 - __oslabel__: The Linux distribution label assigned by Mellanox to the
 tarball. For Ubuntu, the default value is `ubuntu16.04` for
 Ubuntu 16.04, `ubuntu18.04` for Ubuntu 18.04, `ubuntu20.04` for
-Ubuntu 20.04, and `ubuntu22.04` for Ubuntu 22.04. For HPC-X
-version 2.10 and later and RHEL-based Linux distributions, the
-default value is `redhat7` for version 7 and `redhat8` for version
-8. For HPC-X version 2.9 and earlier and RHEL-based Linux
-distributions, the default value is `redhat7.6` for version 7 and
-`redhat8.0` for version 8.
+Ubuntu 20.04, `ubuntu22.04` for Ubuntu 22.04, and `ubuntu24.04`
+for Ubuntu 24.04. For HPC-X version 2.10 and later and RHEL-based
+Linux distributions, the default value is `redhat7` for version 7,
+`redhat8` for version 8, and `redhat9` for version 9. For HPC-X
+version 2.9 and earlier and RHEL-based Linux distributions, the
+default value is `redhat7.6` for version 7 and `redhat8.0` for
+version 8.
 
 - __ospackages__: List of OS packages to install prior to installing
 Mellanox HPC-X. For Ubuntu, the default values are `bzip2`,
@@ -1610,7 +1617,7 @@ distributions the default values are `bzip2`, `numactl-libs`,
 `/usr/local/hpcx`.
 
 - __version__: The version of Mellanox HPC-X to install. The default
-value is `2.22.1`.
+value is `2.24.1`.
 
 __Examples__
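
The default `buildlabel` rules documented above can be sketched as a small standalone function. This is an illustration only, not the hpccm implementation: `parse_version`, `BUILDLABEL_DEFAULTS`, and `default_buildlabel` are hypothetical names, and the tuple-based version comparison is a stand-in for the `Version` class that `hpcx.py` uses.

```python
def parse_version(version):
    """Turn a string such as '2.24.1' into (2, 24, 1) for numeric comparison."""
    return tuple(int(part) for part in version.split('.'))


# (minimum version, default build label), newest first. The values are
# copied from the parameter documentation; the table itself is hypothetical.
BUILDLABEL_DEFAULTS = [
    ((2, 17), 'cuda12'),
    ((2, 16), 'cuda12-gdrcopy2-nccl2.18'),
    ((2, 15), 'cuda12-gdrcopy2-nccl2.17'),
    ((2, 14), 'cuda11-gdrcopy2-nccl2.16'),
    ((2, 12), 'cuda11-gdrcopy2-nccl2.12'),
    ((2, 10), 'cuda11-gdrcopy2-nccl2.11'),
]


def default_buildlabel(version, cuda='cuda13'):
    """Return the documented default build label for an HPC-X version."""
    parsed = parse_version(version)
    if parsed >= (2, 24):
        # 2.24 and later: the build label follows the cuda parameter.
        return cuda
    for minimum, label in BUILDLABEL_DEFAULTS:
        if parsed >= minimum:
            return label
    # 2.9 and earlier: the build label is ignored.
    return None


print(default_buildlabel('2.24.1'))
print(default_buildlabel('2.22.1'))
```

Passing `cuda='cuda12'` together with a 2.24 version would select the CUDA 12 label instead of the new default.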

hpccm/building_blocks/hpcx.py

Lines changed: 39 additions & 25 deletions
@@ -49,13 +49,19 @@ class hpcx(bb_base, hpccm.templates.envvars, hpccm.templates.ldconfig,
     # Parameters
 
     buildlabel: The build label assigned by Mellanox to the tarball.
-    For versions 2.17 and later, the default value is `cuda12`.
-    For version 2.16 the default value is `cuda12-gdrcopy2-nccl2.18`.
-    For version 2.15 the default value is `cuda12-gdrcopy2-nccl2.17`.
-    For version 2.14 the default value is `cuda11-gdrcopy2-nccl2.16`.
-    For versions 2.12 and 2.13 the default value is `cuda11-gdrcopy2-nccl2.12`.
-    For versions 2.10 and 2.11 the default value is `cuda11-gdrcopy2-nccl2.11`.
-    This value is ignored for HPC-X version 2.9 and earlier.
+    For version 2.24 and later, the default value is the value of the
+    `cuda` parameter. For versions 2.17 through 2.23, the default
+    value is `cuda12`. For version 2.16 the default value is
+    `cuda12-gdrcopy2-nccl2.18`. For version 2.15 the default value is
+    `cuda12-gdrcopy2-nccl2.17`. For version 2.14 the default value is
+    `cuda11-gdrcopy2-nccl2.16`. For versions 2.12 and 2.13 the
+    default value is `cuda11-gdrcopy2-nccl2.12`. For versions 2.10
+    and 2.11 the default value is `cuda11-gdrcopy2-nccl2.11`. This
+    value is ignored for HPC-X version 2.9 and earlier.
+
+    cuda: The CUDA label assigned by Mellanox to the tarball. This
+    parameter is only recognized for version 2.24 and later. The
+    default value is `cuda13`.
 
     environment: Boolean flag to specify whether the environment
     should be modified to include HPC-X. This option is only
@@ -102,12 +108,13 @@ class hpcx(bb_base, hpccm.templates.envvars, hpccm.templates.ldconfig,
     oslabel: The Linux distribution label assigned by Mellanox to the
     tarball. For Ubuntu, the default value is `ubuntu16.04` for
     Ubuntu 16.04, `ubuntu18.04` for Ubuntu 18.04, `ubuntu20.04` for
-    Ubuntu 20.04, and `ubuntu22.04` for Ubuntu 22.04. For HPC-X
-    version 2.10 and later and RHEL-based Linux distributions, the
-    default value is `redhat7` for version 7 and `redhat8` for version
-    8. For HPC-X version 2.9 and earlier and RHEL-based Linux
-    distributions, the default value is `redhat7.6` for version 7 and
-    `redhat8.0` for version 8.
+    Ubuntu 20.04, `ubuntu22.04` for Ubuntu 22.04, and `ubuntu24.04`
+    for Ubuntu 24.04. For HPC-X version 2.10 and later and RHEL-based
+    Linux distributions, the default value is `redhat7` for version 7,
+    `redhat8` for version 8, and `redhat9` for version 9. For HPC-X
+    version 2.9 and earlier and RHEL-based Linux distributions, the
+    default value is `redhat7.6` for version 7 and `redhat8.0` for
+    version 8.
 
     ospackages: List of OS packages to install prior to installing
     Mellanox HPC-X. For Ubuntu, the default values are `bzip2`,
@@ -119,7 +126,7 @@ class hpcx(bb_base, hpccm.templates.envvars, hpccm.templates.ldconfig,
     `/usr/local/hpcx`.
 
     version: The version of Mellanox HPC-X to install. The default
-    value is `2.22.1`.
+    value is `2.24.1`.
 
     # Examples
 
@@ -139,6 +146,7 @@ def __init__(self, **kwargs):
                                     'https://content.mellanox.com/hpc/hpc-x')
         self.__bashrc = '' # Filled in by __distro()
         self.__buildlabel = kwargs.get('buildlabel', None)
+        self.__cuda = kwargs.get('cuda', 'cuda13')
         self.__hpcxinit = kwargs.get('hpcxinit', True)
         self.__inbox = kwargs.get('inbox', False)
         self.__mlnx_ofed = kwargs.get('mlnx_ofed', None)
@@ -148,13 +156,15 @@ def __init__(self, **kwargs):
         self.__ospackages = kwargs.get('ospackages', []) # Filled in by _distro()
         self.__packages = kwargs.get('packages', [])
         self.__prefix = kwargs.get('prefix', '/usr/local/hpcx')
-        self.__version = kwargs.get('version', '2.22.1')
+        self.__version = kwargs.get('version', '2.24.1')
 
         self.__commands = [] # Filled in by __setup()
         self.__wd = kwargs.get('wd', hpccm.config.g_wd) # working directory
 
         if not self.__buildlabel:
-            if Version(self.__version) >= Version('2.17'):
+            if Version(self.__version) >= Version('2.24'):
+                self.__buildlabel = self.__cuda
+            elif Version(self.__version) >= Version('2.17'):
                 self.__buildlabel = 'cuda12'
             elif Version(self.__version) >= Version('2.16'):
                 self.__buildlabel = 'cuda12-gdrcopy2-nccl2.18'
@@ -251,16 +261,20 @@ def __setup(self):
         """Construct the series of shell commands, i.e., fill in
         self.__commands"""
 
-        # For version 2.8 and earlier, the download URL has the format
-        # MAJOR.MINOR in the path and the tarball contains
-        # MAJOR.MINOR.REVISION, so pull apart the full version to get
-        # the individual components.
-        version_string = self.__version
-        if Version(self.__version) <= Version('2.8'):
+        version_dirstring = self.__version
+        if Version(self.__version) >= Version('2.24'):
+            # For version 2.24 and later, the download URL has the CUDA
+            # version appended to the directory name.
+            version_dirstring += '_{0}'.format(self.__cuda)
+        elif Version(self.__version) <= Version('2.8'):
+            # For version 2.8 and earlier, the download URL has the format
+            # MAJOR.MINOR in the path and the tarball contains
+            # MAJOR.MINOR.REVISION, so pull apart the full version to get
+            # the individual components.
             match = re.match(r'(?P<major>\d+)\.(?P<minor>\d+)\.(?P<revision>\d+)',
                              self.__version)
-            version_string = '{0}.{1}'.format(match.groupdict()['major'],
-                                              match.groupdict()['minor'])
+            version_dirstring = '{0}.{1}'.format(match.groupdict()['major'],
+                                                 match.groupdict()['minor'])
 
         if self.__inbox:
             # Use inbox OFED
@@ -283,7 +297,7 @@ def __setup(self):
                     self.__version, self.__ofedlabel, self.__oslabel, self.__arch)
 
         tarball = self.__label + '.tbz'
-        url = '{0}/v{1}/{2}'.format(self.__baseurl, version_string, tarball)
+        url = '{0}/v{1}/{2}'.format(self.__baseurl, version_dirstring, tarball)
 
         # Download source from web
         self.__commands.append(self.download_step(url=url, directory=self.__wd))
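
The directory naming change in `__setup` can be sketched outside of hpccm as follows. This is a hedged illustration, not the library code: `parse_version` and `download_url` are hypothetical helpers, and the "2.8 and earlier" branch compares only major and minor components, which is an assumption about the intent stated in the code comment rather than a reproduction of the `Version` comparison.

```python
def parse_version(version):
    """Turn a string such as '2.24.1' into (2, 24, 1)."""
    return tuple(int(part) for part in version.split('.'))


def download_url(version, label, cuda='cuda13',
                 baseurl='https://content.mellanox.com/hpc/hpc-x'):
    """Build the tarball URL from the version, tarball label, and CUDA label."""
    parsed = parse_version(version)
    dirstring = version
    if parsed >= (2, 24):
        # 2.24 and later: the CUDA label is appended to the directory name.
        dirstring += '_{0}'.format(cuda)
    elif parsed[:2] <= (2, 8):
        # 2.8 and earlier: the directory uses MAJOR.MINOR only
        # (assumption: compare on major and minor components).
        dirstring = '{0}.{1}'.format(parsed[0], parsed[1])
    return '{0}/v{1}/{2}.tbz'.format(baseurl, dirstring, label)


print(download_url('2.24.1',
                   'hpcx-v2.24.1-gcc-doca_ofed-ubuntu24.04-cuda13-x86_64'))
```

The URL printed for 2.24.1 should match the one expected by the updated tests, with `v2.24.1_cuda13` as the directory component.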

test/test_hpcx.py

Lines changed: 21 additions & 21 deletions
@@ -36,9 +36,9 @@ def setUp(self):
     @docker
     def test_defaults_ubuntu20(self):
         """Default hpcx building block"""
-        h = hpcx()
+        h = hpcx(version='2.21.3')
         self.assertEqual(str(h),
-r'''# Mellanox HPC-X version 2.22.1
+r'''# Mellanox HPC-X version 2.21.3
 RUN apt-get update -y && \
     DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
         bzip2 \
@@ -47,12 +47,12 @@ def test_defaults_ubuntu20(self):
         tar \
         wget && \
     rm -rf /var/lib/apt/lists/*
-RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://content.mellanox.com/hpc/hpc-x/v2.22.1/hpcx-v2.22.1-gcc-doca_ofed-ubuntu20.04-cuda12-x86_64.tbz && \
-    mkdir -p /var/tmp && tar -x -f /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu20.04-cuda12-x86_64.tbz -C /var/tmp -j && \
-    cp -a /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu20.04-cuda12-x86_64 /usr/local/hpcx && \
+RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://content.mellanox.com/hpc/hpc-x/v2.21.3/hpcx-v2.21.3-gcc-doca_ofed-ubuntu20.04-cuda12-x86_64.tbz && \
+    mkdir -p /var/tmp && tar -x -f /var/tmp/hpcx-v2.21.3-gcc-doca_ofed-ubuntu20.04-cuda12-x86_64.tbz -C /var/tmp -j && \
+    cp -a /var/tmp/hpcx-v2.21.3-gcc-doca_ofed-ubuntu20.04-cuda12-x86_64 /usr/local/hpcx && \
     echo "source /usr/local/hpcx/hpcx-init-ompi.sh" >> /etc/bash.bashrc && \
     echo "hpcx_load" >> /etc/bash.bashrc && \
-    rm -rf /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu20.04-cuda12-x86_64.tbz /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu20.04-cuda12-x86_64''')
+    rm -rf /var/tmp/hpcx-v2.21.3-gcc-doca_ofed-ubuntu20.04-cuda12-x86_64.tbz /var/tmp/hpcx-v2.21.3-gcc-doca_ofed-ubuntu20.04-cuda12-x86_64''')
 
     @x86_64
     @ubuntu24
@@ -61,7 +61,7 @@ def test_defaults_ubuntu24(self):
         """Default hpcx building block"""
         h = hpcx()
         self.assertEqual(str(h),
-r'''# Mellanox HPC-X version 2.22.1
+r'''# Mellanox HPC-X version 2.24.1
 RUN apt-get update -y && \
     DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
         bzip2 \
@@ -70,12 +70,12 @@ def test_defaults_ubuntu24(self):
         tar \
         wget && \
     rm -rf /var/lib/apt/lists/*
-RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://content.mellanox.com/hpc/hpc-x/v2.22.1/hpcx-v2.22.1-gcc-doca_ofed-ubuntu24.04-cuda12-x86_64.tbz && \
-    mkdir -p /var/tmp && tar -x -f /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu24.04-cuda12-x86_64.tbz -C /var/tmp -j && \
-    cp -a /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu24.04-cuda12-x86_64 /usr/local/hpcx && \
+RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://content.mellanox.com/hpc/hpc-x/v2.24.1_cuda13/hpcx-v2.24.1-gcc-doca_ofed-ubuntu24.04-cuda13-x86_64.tbz && \
+    mkdir -p /var/tmp && tar -x -f /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-ubuntu24.04-cuda13-x86_64.tbz -C /var/tmp -j && \
+    cp -a /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-ubuntu24.04-cuda13-x86_64 /usr/local/hpcx && \
     echo "source /usr/local/hpcx/hpcx-init-ompi.sh" >> /etc/bash.bashrc && \
     echo "hpcx_load" >> /etc/bash.bashrc && \
-    rm -rf /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu24.04-cuda12-x86_64.tbz /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu24.04-cuda12-x86_64''')
+    rm -rf /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-ubuntu24.04-cuda13-x86_64.tbz /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-ubuntu24.04-cuda13-x86_64''')
 
     @x86_64
     @centos
@@ -106,20 +106,20 @@ def test_defaults_centos8(self):
         """Default hpcx building block"""
         h = hpcx()
         self.assertEqual(str(h),
-r'''# Mellanox HPC-X version 2.22.1
+r'''# Mellanox HPC-X version 2.24.1
 RUN yum install -y \
         bzip2 \
         numactl-libs \
         openssh-clients \
         tar \
         wget && \
     rm -rf /var/cache/yum/*
-RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://content.mellanox.com/hpc/hpc-x/v2.22.1/hpcx-v2.22.1-gcc-doca_ofed-redhat8-cuda12-x86_64.tbz && \
-    mkdir -p /var/tmp && tar -x -f /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-redhat8-cuda12-x86_64.tbz -C /var/tmp -j && \
-    cp -a /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-redhat8-cuda12-x86_64 /usr/local/hpcx && \
+RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://content.mellanox.com/hpc/hpc-x/v2.24.1_cuda13/hpcx-v2.24.1-gcc-doca_ofed-redhat8-cuda13-x86_64.tbz && \
+    mkdir -p /var/tmp && tar -x -f /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-redhat8-cuda13-x86_64.tbz -C /var/tmp -j && \
+    cp -a /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-redhat8-cuda13-x86_64 /usr/local/hpcx && \
     echo "source /usr/local/hpcx/hpcx-init-ompi.sh" >> /etc/bashrc && \
     echo "hpcx_load" >> /etc/bashrc && \
-    rm -rf /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-redhat8-cuda12-x86_64.tbz /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-redhat8-cuda12-x86_64''')
+    rm -rf /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-redhat8-cuda13-x86_64.tbz /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-redhat8-cuda13-x86_64''')
 
     @x86_64
     @ubuntu
@@ -314,7 +314,7 @@ def test_runtime(self):
         h = hpcx()
         r = h.runtime()
         self.assertEqual(r,
-r'''# Mellanox HPC-X version 2.22.1
+r'''# Mellanox HPC-X version 2.24.1
 RUN apt-get update -y && \
     DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
         bzip2 \
@@ -323,9 +323,9 @@ def test_runtime(self):
         tar \
         wget && \
     rm -rf /var/lib/apt/lists/*
-RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://content.mellanox.com/hpc/hpc-x/v2.22.1/hpcx-v2.22.1-gcc-doca_ofed-ubuntu22.04-cuda12-x86_64.tbz && \
-    mkdir -p /var/tmp && tar -x -f /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu22.04-cuda12-x86_64.tbz -C /var/tmp -j && \
-    cp -a /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu22.04-cuda12-x86_64 /usr/local/hpcx && \
+RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://content.mellanox.com/hpc/hpc-x/v2.24.1_cuda13/hpcx-v2.24.1-gcc-doca_ofed-ubuntu22.04-cuda13-x86_64.tbz && \
+    mkdir -p /var/tmp && tar -x -f /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-ubuntu22.04-cuda13-x86_64.tbz -C /var/tmp -j && \
+    cp -a /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-ubuntu22.04-cuda13-x86_64 /usr/local/hpcx && \
     echo "source /usr/local/hpcx/hpcx-init-ompi.sh" >> /etc/bash.bashrc && \
     echo "hpcx_load" >> /etc/bash.bashrc && \
-    rm -rf /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu22.04-cuda12-x86_64.tbz /var/tmp/hpcx-v2.22.1-gcc-doca_ofed-ubuntu22.04-cuda12-x86_64''')
+    rm -rf /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-ubuntu22.04-cuda13-x86_64.tbz /var/tmp/hpcx-v2.24.1-gcc-doca_ofed-ubuntu22.04-cuda13-x86_64''')
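
The tarball names in the expected test output follow a regular pattern, which the sketch below reconstructs from the strings in the tests. The helper names `tarball_label` and `staging_paths` are hypothetical, and the field order is inferred from the expected output rather than taken from the hpccm source.

```python
def tarball_label(version, ofedlabel, oslabel, buildlabel, arch):
    """Compose the tarball base name as it appears in the expected test output."""
    return 'hpcx-v{0}-gcc-{1}-{2}-{3}-{4}'.format(
        version, ofedlabel, oslabel, buildlabel, arch)


def staging_paths(label, wd='/var/tmp'):
    """Return the downloaded tarball path and the unpacked directory path."""
    tarball = '{0}/{1}.tbz'.format(wd, label)
    unpacked = '{0}/{1}'.format(wd, label)
    return tarball, unpacked


label = tarball_label('2.24.1', 'doca_ofed', 'redhat8', 'cuda13', 'x86_64')
print(label)
print(staging_paths(label))
```

With the new defaults, only the version and build label fields change between the old and new expected strings; the download directory additionally gains the `_cuda13` suffix.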
