Releases: pmodels/mpich
v4.3.2
Changes in 4.3.2
- Improve libfabric provider selection when available providers have
  negative internal score
- Improve error messages when Level Zero failures are detected
- Improve localhost detection in Hydra
- Update libfabric usage to silence deprecation warnings
- Update yaksa for improved reproducibility in code generation
- Update embedded UCX to v1.19.0
- Update embedded libfabric to fix build issue with GCC 15
- Add compatibility with CUDA 13
- Fix missing const in nondestructive request test and status query
  functions
- Fix HCOLL support
- Fix crash with GPU-aware build when running on systems with no GPUs
- Fix HIP device query
- Fix singleton init with Hydra
- Fix thread safety for Level Zero memcpy functions
- Fix potential crash when release gather collective initialization fails
- Fix ch3 connect/accept protocol handling when a discard event arrives
  after a connection is already established
- Fix inlining for posix eager modules in ch4/shm
- Fix weak attribute usage in MPI ABI build
- Fix potential use-after-free bug in Hydra during spawn operations
- Fix bug in persistent bcast algorithm
- Fix bug in fabric coordinate retrieval with PMIx
- Fix integer overflow and signed/unsigned bugs in ROMIO
- Fix Quobyte ROMIO driver build error
- Fix broken string conversion in mpi_f08 module
- Fix compilation issue with mpi_f08 and NAG compiler
- Fixes for various test program bugs
v4.3.2rc2
Changes in 4.3.2
- Improve libfabric provider selection when available providers have
  negative internal score
- Improve error messages when Level Zero failures are detected
- Improve localhost detection in Hydra
- Update libfabric usage to silence deprecation warnings
- Update yaksa for improved reproducibility in code generation
- Update embedded UCX to v1.19.0
- Update embedded libfabric to fix build issue with GCC 15
- Add compatibility with CUDA 13
- Fix missing const in nondestructive request test and status query
  functions
- Fix HCOLL support
- Fix crash with GPU-aware build when running on systems with no GPUs
- Fix HIP device query
- Fix singleton init with Hydra
- Fix thread safety for Level Zero memcpy functions
- Fix potential crash when release gather collective initialization fails
- Fix ch3 connect/accept protocol handling when a discard event arrives
  after a connection is already established
- Fix inlining for posix eager modules in ch4/shm
- Fix weak attribute usage in MPI ABI build
- Fix potential use-after-free bug in Hydra during spawn operations
- Fix bug in persistent bcast algorithm
- Fix bug in fabric coordinate retrieval with PMIx
- Fix integer overflow and signed/unsigned bugs in ROMIO
- Fix Quobyte ROMIO driver build error
- Fix broken string conversion in mpi_f08 module
- Fix compilation issue with mpi_f08 and NAG compiler
- Fixes for various test program bugs
v4.3.2rc1
Changes in 4.3.2
- Improve libfabric provider selection when available providers have
  negative internal score
- Improve error messages when Level Zero failures are detected
- Improve localhost detection in Hydra
- Update libfabric usage to silence deprecation warnings
- Update yaksa for improved reproducibility in code generation
- Update embedded UCX to v1.19.0
- Update embedded libfabric to fix build issue with GCC 15
- Add compatibility with CUDA 13
- Fix missing const in nondestructive request test and status query
  functions
- Fix HCOLL support
- Fix crash with GPU-aware build when running on systems with no GPUs
- Fix singleton init with Hydra
- Fix thread safety for Level Zero memcpy functions
- Fix potential crash when release gather collective initialization fails
- Fix ch3 connect/accept protocol handling when a discard event arrives
  after a connection is already established
- Fix inlining for posix eager modules in ch4/shm
- Fix weak attribute usage in MPI ABI build
- Fix potential use-after-free bug in Hydra during spawn operations
- Fix bug in persistent bcast algorithm
- Fix bug in fabric coordinate retrieval with PMIx
- Fix integer overflow and signed/unsigned bugs in ROMIO
- Fix Quobyte ROMIO driver build error
- Fix broken string conversion in mpi_f08 module
- Fix compilation issue with mpi_f08 and NAG compiler
- Fixes for various test program bugs
v4.3.1
Changes in 4.3.1
- Fix initialization in GPU-aware builds when no devices are present
- Fix internal pmix.h header conflict when building with an external
  PMIx library.
- Fix build issue with Slurm by removing dependency on libslurm and
  always using internal logic for parsing the Slurm hostfile.
- Fix potential stale GPU IPC handle usage resulting in data corruption
  or crashes
- Update XPMEM thresholds to avoid excessive buffer mapping overhead
- Fix potential hang in ROMIO when setting info hints on certain files
- Improved detection of incompatible PMI[x] client/server configuration
- Fix use of PMIX_PREFIX attribute for certain versions of OpenPMIx
- Fix Intel GPU output with MPIR_CVAR_DEBUG_SUMMARY
- Fix F08 binding compilation with nvfortran
- Fix line continuations with Hydra's --configfile option
- Fix valgrind uninitialized read warnings in ch3
- Fix missing mpixxx_opts.conf file with help text for mpicc and friends
- Fixes for several compiler warnings
v4.3.1rc1
Changes in 4.3.1
- Fix initialization in GPU-aware builds when no devices are present
- Fix internal pmix.h header conflict when building with an external
  PMIx library.
- Fix build issue with Slurm by removing dependency on libslurm and
  always using internal logic for parsing the Slurm hostfile.
- Fix potential stale GPU IPC handle usage resulting in data corruption
  or crashes
- Update XPMEM thresholds to avoid excessive buffer mapping overhead
- Fix potential hang in ROMIO when setting info hints on certain files
- Improved detection of incompatible PMI[x] client/server configuration
- Fix use of PMIX_PREFIX attribute for certain versions of OpenPMIx
- Fix Intel GPU output with MPIR_CVAR_DEBUG_SUMMARY
- Fix F08 binding compilation with nvfortran
- Fix line continuations with Hydra's --configfile option
- Fix valgrind uninitialized read warnings in ch3
- Fix missing mpixxx_opts.conf file with help text for mpicc and friends
- Fixes for several compiler warnings
v4.3.0
Changes in 4.3.0
- Support the MPI memory allocation kinds side document.
- Support the MPI ABI Proposal. Configure with --enable-mpi-abi and build with
  mpicc_abi. By default, mpicc still builds and links with the MPICH ABI.
- Experimental API MPIX_Op_create_x. It supports a user callback function with
  an extra_state context and an op destructor callback. This lets language
  bindings use a proxy function for language-specific user callbacks.
- Experimental API MPIX_{Comm,File,Session,Win}_create_errhandler_x. They allow
  user error handlers to have an extra_state context and a corresponding
  destructor. This allows language bindings to implement user error handlers
  via a proxy.
- Experimental API MPIX_Request_is_complete. This is a pure request state query
  function that neither invokes progress nor frees the request. It should
  help applications that want to separate task dependency checking from the
  progress engine to avoid progress contention, especially in multi-threaded
  contexts. It is also useful for tools that profile non-deterministic calls
  such as MPI_Test.
- Experimental API MPIX_Async_start. This function lets applications inject
  progress hooks into MPI progress. It allows applications to implement custom
  asynchronous operations that will be progressed by MPI. It avoids having to
  implement separate progress mechanisms that may either take additional
  resources or contend with MPI progress and negatively impact performance. It
  also allows applications to create custom MPI operations, such as MPI
  nonblocking collectives, and achieve near-native performance.
- Added benchmark tests test/mpi/bench/p2p_{latency,bw}.
- Added CMA support in CH4 IPC.
- Added IPC read algorithm for intranode Allgather and Allgatherv.
- Added CVAR MPIR_CVAR_CH4_SHM_POSIX_TOPO_ENABLE to enable non-temporal memcpy
  for inter-NUMA shm communication.
- Added CVAR MPIR_CVAR_DEBUG_PROGRESS_TIMEOUT for debugging MPI deadlock issues.
- ch4:ucx now supports dynamic processes. MPI_Comm_spawn{_multiple} will work.
  MPI_Open_port will fail because the ucx port name exceeds the current
  MPI_MAX_PORT_NAME of 256. One can work around this by using the info hint
  "port_name_size" and a larger port name buffer.
- PMI-1 defines PMI_MAX_PORT_NAME, which may be different from MPI_MAX_PORT_NAME.
  This is used by "PMI_Lookup_name". Consequently, MPI_Lookup_name accepts an
  info hint "port_name_size" that may be larger than MPI_MAX_PORT_NAME. If the
  port name does not fit in "port_name_size", it will return a truncation error.
- Autogen now defaults to -yaksa-depth=2.
- Default MPIR_CVAR_CH4_ROOTS_ONLY_PMI to on.
- Added ch4 netmod API am_tag_send and am_tag_recv.
- Added MPIR_CVAR_CH4_OFI_EAGER_THRESHOLD to force RNDV send mode.
- The make check target now runs ROMIO tests.
- Add back handle conversion macros (f2c/c2f) to preserve ABI
  compatibility with older MPICH libraries
- Fix compilation issue with g++ in -std=gnu++20 mode
- Fix bug in MPI_ANY_SOURCE handling observed using the libfabric CXI
  provider
- Add NIC information to error messages in ch4:ofi netmod
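
The extra_state context and destructor callback described above for
MPIX_Op_create_x and MPIX_{Comm,File,Session,Win}_create_errhandler_x follow a
common C callback pattern. The sketch below illustrates that pattern in plain
C, without MPI. Every name in it (op_x, op_x_create, op_x_apply, op_x_free,
scale_ctx, demo_scaled_sum) is hypothetical and chosen for illustration; none
of them are MPICH declarations, and the real MPIX signatures may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types for illustration only; NOT the MPICH API. */
typedef void (user_fn_x)(const int *in, int *inout, int len, void *extra_state);
typedef void (destructor_fn)(void *extra_state);

typedef struct {
    user_fn_x *fn;             /* user (or proxy) callback */
    destructor_fn *destructor; /* releases extra_state when the op is freed */
    void *extra_state;         /* opaque context passed back to fn */
} op_x;

/* Register a callback together with its context and destructor. */
static op_x *op_x_create(user_fn_x *fn, destructor_fn *destructor,
                         void *extra_state)
{
    op_x *op = malloc(sizeof(*op));
    if (op == NULL)
        return NULL;
    op->fn = fn;
    op->destructor = destructor;
    op->extra_state = extra_state;
    return op;
}

/* Invoke the callback; extra_state is handed back unchanged. */
static void op_x_apply(const op_x *op, const int *in, int *inout, int len)
{
    op->fn(in, inout, len, op->extra_state);
}

/* Free the op; the destructor releases the context exactly once. */
static void op_x_free(op_x *op)
{
    if (op->destructor != NULL)
        op->destructor(op->extra_state);
    free(op);
}

/* Example context a language binding might attach to its proxy. */
typedef struct {
    int scale;
    int *destroyed; /* incremented by the destructor so callers can observe it */
} scale_ctx;

/* Proxy-style callback: reads its parameters from extra_state. */
static void scaled_sum(const int *in, int *inout, int len, void *extra_state)
{
    const scale_ctx *ctx = extra_state;
    for (int i = 0; i < len; i++)
        inout[i] += ctx->scale * in[i];
}

static void scale_ctx_free(void *extra_state)
{
    scale_ctx *ctx = extra_state;
    (*ctx->destroyed)++;
    free(ctx);
}

/* Full lifecycle: create, apply once, free. Returns inout[0], or -1 on
 * allocation failure. */
static int demo_scaled_sum(int scale, int *destroyed)
{
    assert(destroyed != NULL);
    scale_ctx *ctx = malloc(sizeof(*ctx));
    if (ctx == NULL)
        return -1;
    ctx->scale = scale;
    ctx->destroyed = destroyed;

    op_x *op = op_x_create(scaled_sum, scale_ctx_free, ctx);
    if (op == NULL) {
        free(ctx);
        return -1;
    }

    int in[2] = { 2, 4 };
    int inout[2] = { 1, 1 };
    op_x_apply(op, in, inout, 2); /* inout[i] += scale * in[i] */

    op_x_free(op); /* destructor runs here and frees ctx */
    return inout[0];
}
```

Calling demo_scaled_sum(3, &count) from a main function exercises the whole
lifecycle. Because the context travels with the callback, a language binding
can register one generic C proxy and keep the language-specific callable in
extra_state, with the destructor releasing it when the op is freed.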
v4.3.0rc4
Changes in 4.3
- Support the MPI memory allocation kinds side document.
- Support the MPI ABI Proposal. Configure with --enable-mpi-abi and build with
  mpicc_abi. By default, mpicc still builds and links with the MPICH ABI.
- Experimental API MPIX_Op_create_x. It supports a user callback function with
  an extra_state context and an op destructor callback. This lets language
  bindings use a proxy function for language-specific user callbacks.
- Experimental API MPIX_{Comm,File,Session,Win}_create_errhandler_x. They allow
  user error handlers to have an extra_state context and a corresponding
  destructor. This allows language bindings to implement user error handlers
  via a proxy.
- Experimental API MPIX_Request_is_complete. This is a pure request state query
  function that neither invokes progress nor frees the request. It should
  help applications that want to separate task dependency checking from the
  progress engine to avoid progress contention, especially in multi-threaded
  contexts. It is also useful for tools that profile non-deterministic calls
  such as MPI_Test.
- Experimental API MPIX_Async_start. This function lets applications inject
  progress hooks into MPI progress. It allows applications to implement custom
  asynchronous operations that will be progressed by MPI. It avoids having to
  implement separate progress mechanisms that may either take additional
  resources or contend with MPI progress and negatively impact performance. It
  also allows applications to create custom MPI operations, such as MPI
  nonblocking collectives, and achieve near-native performance.
- Added benchmark tests test/mpi/bench/p2p_{latency,bw}.
- Added CMA support in CH4 IPC.
- Added IPC read algorithm for intranode Allgather and Allgatherv.
- Added CVAR MPIR_CVAR_CH4_SHM_POSIX_TOPO_ENABLE to enable non-temporal memcpy
  for inter-NUMA shm communication.
- Added CVAR MPIR_CVAR_DEBUG_PROGRESS_TIMEOUT for debugging MPI deadlock issues.
- ch4:ucx now supports dynamic processes. MPI_Comm_spawn{_multiple} will work.
  MPI_Open_port will fail because the ucx port name exceeds the current
  MPI_MAX_PORT_NAME of 256. One can work around this by using the info hint
  "port_name_size" and a larger port name buffer.
- PMI-1 defines PMI_MAX_PORT_NAME, which may be different from MPI_MAX_PORT_NAME.
  This is used by "PMI_Lookup_name". Consequently, MPI_Lookup_name accepts an
  info hint "port_name_size" that may be larger than MPI_MAX_PORT_NAME. If the
  port name does not fit in "port_name_size", it will return a truncation error.
- Autogen now defaults to -yaksa-depth=2.
- Default MPIR_CVAR_CH4_ROOTS_ONLY_PMI to on.
- Added ch4 netmod API am_tag_send and am_tag_recv.
- Added MPIR_CVAR_CH4_OFI_EAGER_THRESHOLD to force RNDV send mode.
- The make check target now runs ROMIO tests.
- Add back handle conversion macros (f2c/c2f) to preserve ABI
  compatibility with older MPICH libraries
- Fix compilation issue with g++ in -std=gnu++20 mode
- Fix bug in MPI_ANY_SOURCE handling observed using the libfabric CXI
  provider
- Add NIC information to error messages in ch4:ofi netmod
v4.3.0rc3
Changes in 4.3
- Support the MPI memory allocation kinds side document.
- Support the MPI ABI Proposal. Configure with --enable-mpi-abi and build with
  mpicc_abi. By default, mpicc still builds and links with the MPICH ABI.
- Experimental API MPIX_Op_create_x. It supports a user callback function with
  an extra_state context and an op destructor callback. This lets language
  bindings use a proxy function for language-specific user callbacks.
- Experimental API MPIX_{Comm,File,Session,Win}_create_errhandler_x. They allow
  user error handlers to have an extra_state context and a corresponding
  destructor. This allows language bindings to implement user error handlers
  via a proxy.
- Experimental API MPIX_Request_is_complete. This is a pure request state query
  function that neither invokes progress nor frees the request. It should
  help applications that want to separate task dependency checking from the
  progress engine to avoid progress contention, especially in multi-threaded
  contexts. It is also useful for tools that profile non-deterministic calls
  such as MPI_Test.
- Experimental API MPIX_Async_start. This function lets applications inject
  progress hooks into MPI progress. It allows applications to implement custom
  asynchronous operations that will be progressed by MPI. It avoids having to
  implement separate progress mechanisms that may either take additional
  resources or contend with MPI progress and negatively impact performance. It
  also allows applications to create custom MPI operations, such as MPI
  nonblocking collectives, and achieve near-native performance.
- Added benchmark tests test/mpi/bench/p2p_{latency,bw}.
- Added CMA support in CH4 IPC.
- Added IPC read algorithm for intranode Allgather and Allgatherv.
- Added CVAR MPIR_CVAR_CH4_SHM_POSIX_TOPO_ENABLE to enable non-temporal memcpy
  for inter-NUMA shm communication.
- Added CVAR MPIR_CVAR_DEBUG_PROGRESS_TIMEOUT for debugging MPI deadlock issues.
- ch4:ucx now supports dynamic processes. MPI_Comm_spawn{_multiple} will work.
  MPI_Open_port will fail because the ucx port name exceeds the current
  MPI_MAX_PORT_NAME of 256. One can work around this by using the info hint
  "port_name_size" and a larger port name buffer.
- PMI-1 defines PMI_MAX_PORT_NAME, which may be different from MPI_MAX_PORT_NAME.
  This is used by "PMI_Lookup_name". Consequently, MPI_Lookup_name accepts an
  info hint "port_name_size" that may be larger than MPI_MAX_PORT_NAME. If the
  port name does not fit in "port_name_size", it will return a truncation error.
- Autogen now defaults to -yaksa-depth=2.
- Default MPIR_CVAR_CH4_ROOTS_ONLY_PMI to on.
- Added ch4 netmod API am_tag_send and am_tag_recv.
- Added MPIR_CVAR_CH4_OFI_EAGER_THRESHOLD to force RNDV send mode.
- The make check target now runs ROMIO tests.
- Add back handle conversion macros (f2c/c2f) to preserve ABI
  compatibility with older MPICH libraries
- Fix compilation issue with g++ in -std=gnu++20 mode
- Fix bug in MPI_ANY_SOURCE handling observed using the libfabric CXI
  provider
- Add NIC information to error messages in ch4:ofi netmod
v4.3.0rc2
Changes in 4.3
- Support the MPI memory allocation kinds side document.
- Support the MPI ABI Proposal. Configure with --enable-mpi-abi and build with
  mpicc_abi. By default, mpicc still builds and links with the MPICH ABI.
- Experimental API MPIX_Op_create_x. It supports a user callback function with
  an extra_state context and an op destructor callback. This lets language
  bindings use a proxy function for language-specific user callbacks.
- Experimental API MPIX_{Comm,File,Session,Win}_create_errhandler_x. They allow
  user error handlers to have an extra_state context and a corresponding
  destructor. This allows language bindings to implement user error handlers
  via a proxy.
- Experimental API MPIX_Request_is_complete. This is a pure request state query
  function that neither invokes progress nor frees the request. It should
  help applications that want to separate task dependency checking from the
  progress engine to avoid progress contention, especially in multi-threaded
  contexts. It is also useful for tools that profile non-deterministic calls
  such as MPI_Test.
- Experimental API MPIX_Async_start. This function lets applications inject
  progress hooks into MPI progress. It allows applications to implement custom
  asynchronous operations that will be progressed by MPI. It avoids having to
  implement separate progress mechanisms that may either take additional
  resources or contend with MPI progress and negatively impact performance. It
  also allows applications to create custom MPI operations, such as MPI
  nonblocking collectives, and achieve near-native performance.
- Added benchmark tests test/mpi/bench/p2p_{latency,bw}.
- Added CMA support in CH4 IPC.
- Added IPC read algorithm for intranode Allgather and Allgatherv.
- Added CVAR MPIR_CVAR_CH4_SHM_POSIX_TOPO_ENABLE to enable non-temporal memcpy
  for inter-NUMA shm communication.
- Added CVAR MPIR_CVAR_DEBUG_PROGRESS_TIMEOUT for debugging MPI deadlock issues.
- ch4:ucx now supports dynamic processes. MPI_Comm_spawn{_multiple} will work.
  MPI_Open_port will fail because the ucx port name exceeds the current
  MPI_MAX_PORT_NAME of 256. One can work around this by using the info hint
  "port_name_size" and a larger port name buffer.
- PMI-1 defines PMI_MAX_PORT_NAME, which may be different from MPI_MAX_PORT_NAME.
  This is used by "PMI_Lookup_name". Consequently, MPI_Lookup_name accepts an
  info hint "port_name_size" that may be larger than MPI_MAX_PORT_NAME. If the
  port name does not fit in "port_name_size", it will return a truncation error.
- Autogen now defaults to -yaksa-depth=2.
- Default MPIR_CVAR_CH4_ROOTS_ONLY_PMI to on.
- Added ch4 netmod API am_tag_send and am_tag_recv.
- Added MPIR_CVAR_CH4_OFI_EAGER_THRESHOLD to force RNDV send mode.
- The make check target now runs ROMIO tests.
- Add back handle conversion macros (f2c/c2f) to preserve ABI
  compatibility with older MPICH libraries
- Fix compilation issue with g++ in -std=gnu++20 mode
v4.3.0rc1
Changes in 4.3
- Support the MPI memory allocation kinds side document.
- Support the MPI ABI Proposal. Configure with --enable-mpi-abi and build with
  mpicc_abi. By default, mpicc still builds and links with the MPICH ABI.
- Experimental API MPIX_Op_create_x. It supports a user callback function with
  an extra_state context and an op destructor callback. This lets language
  bindings use a proxy function for language-specific user callbacks.
- Experimental API MPIX_{Comm,File,Session,Win}_create_errhandler_x. They allow
  user error handlers to have an extra_state context and a corresponding
  destructor. This allows language bindings to implement user error handlers
  via a proxy.
- Experimental API MPIX_Request_is_complete. This is a pure request state query
  function that neither invokes progress nor frees the request. It should
  help applications that want to separate task dependency checking from the
  progress engine to avoid progress contention, especially in multi-threaded
  contexts. It is also useful for tools that profile non-deterministic calls
  such as MPI_Test.
- Experimental API MPIX_Async_start. This function lets applications inject
  progress hooks into MPI progress. It allows applications to implement custom
  asynchronous operations that will be progressed by MPI. It avoids having to
  implement separate progress mechanisms that may either take additional
  resources or contend with MPI progress and negatively impact performance. It
  also allows applications to create custom MPI operations, such as MPI
  nonblocking collectives, and achieve near-native performance.
- Added benchmark tests test/mpi/bench/p2p_{latency,bw}.
- Added CMA support in CH4 IPC.
- Added IPC read algorithm for intranode Allgather and Allgatherv.
- Added CVAR MPIR_CVAR_CH4_SHM_POSIX_TOPO_ENABLE to enable non-temporal memcpy
  for inter-NUMA shm communication.
- Added CVAR MPIR_CVAR_DEBUG_PROGRESS_TIMEOUT for debugging MPI deadlock issues.
- ch4:ucx now supports dynamic processes. MPI_Comm_spawn{_multiple} will work.
  MPI_Open_port will fail because the ucx port name exceeds the current
  MPI_MAX_PORT_NAME of 256. One can work around this by using the info hint
  "port_name_size" and a larger port name buffer.
- PMI-1 defines PMI_MAX_PORT_NAME, which may be different from MPI_MAX_PORT_NAME.
  This is used by "PMI_Lookup_name". Consequently, MPI_Lookup_name accepts an
  info hint "port_name_size" that may be larger than MPI_MAX_PORT_NAME. If the
  port name does not fit in "port_name_size", it will return a truncation error.
- Autogen now defaults to -yaksa-depth=2.
- Default MPIR_CVAR_CH4_ROOTS_ONLY_PMI to on.
- Added ch4 netmod API am_tag_send and am_tag_recv.
- Added MPIR_CVAR_CH4_OFI_EAGER_THRESHOLD to force RNDV send mode.
- The make check target now runs ROMIO tests.