Description
So far, we have done MPI-parallel testing in a cumbersome way: pytest runs the tests in serial, and each test launches a parallel execution using subprocess. This is slow, messy to write, and gives useless error messages.
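For reference, the current pattern looks roughly like the sketch below. The launcher name mpiexec, the helper names build_mpi_command and run_in_parallel, and the script path are assumptions made for illustration, not the names used in our test suite.

```python
import subprocess
import sys


def build_mpi_command(nprocs, script, launcher="mpiexec"):
    """Assemble the command line that starts `script` on `nprocs` MPI processes."""
    return [launcher, "-n", str(nprocs), sys.executable, script]


def run_in_parallel(nprocs, script):
    """Run `script` under MPI and fail if the launcher exits with an error."""
    result = subprocess.run(
        build_mpi_command(nprocs, script), capture_output=True, text=True
    )
    # Only the exit code and the raw output make it back to pytest; the
    # assertion that actually failed inside the child process is buried
    # somewhere in stderr.
    assert result.returncode == 0, result.stderr
```

A serial pytest test then only calls something like run_in_parallel(2, "path/to/script.py"), so pytest sees a single pass/fail per launch rather than the individual assertions inside the parallel code.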
There are MPI extensions for pytest. None of them is perfect, but some are significantly better than what we do now. I suggest we add one of them to the pipeline and transition to using it in the future, although I am not 100% sure which is the best at this time.
The Firedrake solution
The Firedrake people develop mpi-pytest. This works very simply:
import pytest

@pytest.mark.parallel(nprocs=[1, 2, 3])  # run in parallel on 1, 2 and 3 processes
def test_my_code_on_variable_nprocs():
    ...
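The downside is that it doesn't use the Python assert statement, but a separate function. On the plus side, this function communicates the result of the assertion between tasks, but it requires a different usage from normal. The normal assert is still available, but may lead to deadlocks: a task whose assertion fails leaves the test, while the other tasks carry on and wait for it in the next collective operation.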
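To illustrate the idea of an assertion that communicates its result between tasks, here is a minimal sketch. It is not mpi-pytest's implementation or API: collective_assert and SingleProcessComm are hypothetical names, and comm is only assumed to offer an allgather(obj) method that returns one entry per rank, as mpi4py communicators do.

```python
def collective_assert(comm, condition, msg=""):
    """Raise AssertionError on every rank if `condition` is false on any rank."""
    # Every rank contributes its local result, so all ranks reach the same
    # verdict and none of them is left waiting in a later collective call.
    results = comm.allgather(bool(condition))
    failed = [rank for rank, ok in enumerate(results) if not ok]
    if failed:
        raise AssertionError(f"assertion failed on ranks {failed}: {msg}")


class SingleProcessComm:
    """Stand-in communicator with a single rank, so the sketch runs without MPI."""

    def allgather(self, obj):
        return [obj]


# Passes silently because the only rank reports success.
collective_assert(SingleProcessComm(), 1 + 1 == 2, "arithmetic is broken")
```

With mpi4py installed, MPI.COMM_WORLD could be passed instead of the stand-in. The price is the one noted above: the condition has to go through a function call, so pytest's rewriting of plain assert statements does not apply.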
The DLR solution
The DLR people develop pytest-isolate-mpi. This seems a bit more mature, but also more cumbersome. A test is written like this:
import pytest

@pytest.mark.mpi(ranks=[1, 2, 3])  # run on 1, 2 and 3 processes
def test_number_of_processes_matches_ranks(mpi_ranks, comm):
    ...
Note the mpi_ranks argument, which every test function now has to have. comm is a pytest fixture here that is optional as an argument, I believe. I am not actually sure how deadlocks following asserts are handled here. There is no special assertion in pytest-isolate-mpi, but there is easy handling of timeouts to deal with deadlocks, and the reports show on which tasks the test has failed.
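The sketch below shows, in general terms, why a timeout turns a deadlock into an ordinary test failure. It uses only the standard library and is not how pytest-isolate-mpi implements or configures timeouts. The helper run_with_timeout is a hypothetical name, and a sleeping child process stands in for a hung MPI job.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_s):
    """Return True if `command` finished in time, False if it had to be killed."""
    try:
        subprocess.run(command, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so a hung job
        # becomes a reportable failure instead of blocking the pipeline.
        return False
    return True


# A child that sleeps far longer than the timeout stands in for a deadlock.
hung_job = [sys.executable, "-c", "import time; time.sleep(30)"]
finished = run_with_timeout(hung_job, timeout_s=1)
```

Here finished ends up False after about one second, whereas without the timeout the call would block for the full 30 seconds (or forever, for a real deadlock).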
Choosing an option
Feel free to add more options or opinions in the comments. I am not clear on all the pros and cons, and I have not fully decided which I prefer. Both of the above options are pip-installable and neither requires exotic dependencies.