OpenMPI main branch 3rd-party prrte points to obsolete/bugged version #13337

@fgava90

Description

Thank you for taking the time to submit an issue!

Background information

What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)

I'm compiling from the main branch; this is the commit that appears in the configure log:

*** Checking versions
checking for repo version... cfb5505
checking Open MPI version... 5.1.0a1
checking Open MPI release date... Unreleased developer copy
checking Open MPI repository version... cfb5505

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

I installed it with spack v0.23:
spack install openmpi@main

configure '--disable-option-checking' '--prefix=/home/gavaf/spack/opt/linux-rocky8-x86_64/gcc-11.3.0/openmpi/main-6ksrvxfwdiezibx65yoxwbkw5xuywryw' --enable-prte-ft --with-proxy-version-string=5.1.0a1 --with-proxy-package-name="Open MPI" --with-proxy-bugreport="https://www.open-mpi.org/community/help/" --disable-devel-check --enable-prte-prefix-by-default --disable-pmix-lib-checks --with-pmix-extra-libs="/cfdbuild/gavaf/spack-stage/spack-stage-openmpi-main-6ksrvxfwdiezibx65yoxwbkw5xuywryw/spack-src/3rd-party/openpmix/src/libpmix.la" '--enable-shared' '--disable-silent-rules' '--disable-sphinx' '--enable-builtin-atomics' '--disable-static' '--enable-mpi1-compatibility' '--with-ucx=/usr' '--without-psm2' '--without-xpmem' '--without-cma' '--without-ofi' '--without-verbs' '--without-mxm' '--without-psm' '--without-ucc' '--without-knem' '--without-hcoll' '--without-fca' '--without-cray-xpmem' '--without-loadleveler' '--without-lsf' '--without-alps' '--with-slurm' '--without-tm' '--without-sge' '--disable-memchecker' '--with-libevent=/home/gavaf/spack/opt/linux-rocky8-x86_64/gcc-11.3.0/libevent/2.1.12-rbkwyej3j3ubh65p3qb3fpk7feut767l' '--with-zlib=/home/gavaf/spack/opt/linux-rocky8-x86_64/gcc-11.3.0/zlib-ng/2.2.1-2juy7xtmbpubtb6al6q2h6grlpbltgxa' '--with-hwloc=/usr' '--disable-java' '--disable-mpi-java' '--disable-io-romio' '--with-gpfs=no' '--without-cuda' '--enable-wrapper-rpath' '--disable-wrapper-runpath' '--disable-debug'

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

 08e41ed5629b51832f5708181af6d89218c7a74e 3rd-party/openpmix (v1.1.3-4067-g08e41ed5)
 30cadc6746ebddd69ea42ca78b964398f782e4e3 3rd-party/prrte (psrvr-v2.0.0rc1-4839-g30cadc6746)
 6032f68dd9636b48977f59e986acc01a746593a6 3rd-party/pympistandard (remotes/origin/main-23-g6032f68)
 dfff67569fb72dbf8d73a1dcf74d091dad93f71b config/oac (dfff675)

Please describe the system on which you are running

  • Operating system/version: Rocky Linux 8.8
  • Computer hardware: AMD EPYC Processor
  • Network type:

Details of the problem

The prrte commit that the 3rd-party/prrte submodule points to is buggy.
When running with --map-by method:mod1,mod2, only mod1 is taken into account.
There was a long discussion about this in issue #12967, which stemmed from a different issue.
As @rhc54 pointed out, this has been fixed in prrte v3 (I've tested 3.0.11 myself and the issue is indeed solved there).
However, from the output of git submodule status it looks like 3rd-party/prrte points to psrvr-v2.0.0rc1-4839-g30cadc6746, which still has the bug.
Here is an example:

mpirun -np 4 --verbose --bind-to hwt --map-by pe-list=0,3,128,131:ordered:hwtcpus --report-bindings --mca mca_base_verbose debug --mca rmaps_base_verbose 100  hostname

...
rmaps:base set policy with pe-list=0,3,128,131:ordered:hwtcpus
rmaps:base policy pe-list=0,3,128,131 modifiers ordered provided
rmaps:base check modifiers with ordered
mca:rmaps: mapping job prterun-mrl-pldevbld88-3796956@1
mca:rmaps: setting mapping policies for job prterun-mrl-pldevbld88-3796956@1 inherit TRUE hwtcpus FALSE

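This shows that the hwtcpus modifier has not been taken into account: only ordered is reported as provided, and the mapping policies for the job are set with hwtcpus FALSE.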
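
To make the mismatch in the verbose output easier to spot, here is a minimal Python sketch that compares the modifiers requested on the command line with the ones the log reports as provided. It is not part of prrte. It assumes the log lines look exactly like the ones above and that modifiers are separated by colons, as in the example command. The names LOG, requested_modifiers, recognized_modifiers and dropped_modifiers are hypothetical helpers made up for this illustration.

```python
import re

# Log excerpt copied from the rmaps_base_verbose output above.
LOG = """\
rmaps:base set policy with pe-list=0,3,128,131:ordered:hwtcpus
rmaps:base policy pe-list=0,3,128,131 modifiers ordered provided
rmaps:base check modifiers with ordered
"""


def requested_modifiers(log):
    """Modifiers given on the command line, taken from the 'set policy with' line."""
    match = re.search(r"set policy with (\S+)", log)
    if match is None:
        return []
    # Assumption: the first colon-separated field is the policy
    # (e.g. pe-list=0,3,128,131) and the remaining fields are modifiers.
    _policy, *modifiers = match.group(1).split(":")
    return modifiers


def recognized_modifiers(log):
    """Modifiers the mapper reports in the 'policy ... modifiers ... provided' line."""
    match = re.search(r"policy \S+ modifiers (\S+) provided", log)
    if match is None:
        return []
    # The separator used in the log is not known for sure, so accept ':' or ','.
    return re.split(r"[:,]", match.group(1))


def dropped_modifiers(log):
    """Requested modifiers that never show up as recognized."""
    seen = set(recognized_modifiers(log))
    return [m for m in requested_modifiers(log) if m not in seen]


if __name__ == "__main__":
    print(dropped_modifiers(LOG))  # prints ['hwtcpus']
```

Running the same check against the verbose output of a prrte v3 build should give an empty list if both modifiers are picked up, provided the log format there is the same (an assumption I have not verified).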

I think the solution would be to update the commit reference in 3rd-party/prrte to a prrte v3 commit (prrte v4 would also require updating pmix to v6).
