
FaultTolerance

Jeff Squyres edited this page Sep 2, 2014 · 3 revisions

General Description

Open MPI seeks to support both data and process fault tolerance. Support for data reliability and network failover is in active development. Process-level fault tolerance in its many flavors (e.g., checkpoint/restart, message logging) is also in active development. This page is intended to provide the user community with updates on the progress of these development efforts.

Data Reliability

To be written...

Network Failover

To be written...

Checkpoint/Restart

See Checkpoint/Restart Process Fault Tolerance for more information.
