You should run this, not with `python`, but with `mpirun`.
Or, you can run it in a SLURM job, using `srun` (or `mpirun`).
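For example, with four MPI processes (a sketch; `hello_mpi.py` stands in
for this file's actual name):

    mpirun -np 4 python hello_mpi.py

or, from inside a SLURM allocation:

    srun python hello_mpi.py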
- Read the README.rst for more details!
+ Read the README.md for more details!
"""
# Import MPI at the start, as we'll use it everywhere.
@@ -64,8 +64,8 @@ def main():
    else:
        return mpi_nonroot(mpi_comm)

- # This program has two parts: The root part and the non-root part.
- # The root part is executed by rank 0; the non-root part by everyone else.
+ # This program has two parts: The controller part and the worker part.
+ # The controller part is executed by rank 0; the worker part by everyone else.
# SOME TERMINOLOGY:
# MPI World: All of the MPI processes, spawned by `mpirun` or SLURM/srun.
# MPI Size: The number of MPI slots (or SLURM tasks).
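
# For example, a minimal sketch of how the terminology above maps to
# mpi4py calls (`mpi_comm`, `mpi_rank`, and `mpi_size` are assumed names):
#
#     mpi_comm = MPI.COMM_WORLD        # the MPI World communicator
#     mpi_rank = mpi_comm.Get_rank()   # our place in the World (0 = controller)
#     mpi_size = mpi_comm.Get_size()   # the MPI Size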
@@ -95,6 +95,10 @@ def mpi_root(mpi_comm):
Once all results are gathered, output each result (the gathered array is
sorted by MPI rank). Verify that each int returned is correct, by doing
the math (`returned int == random_number + MPI_rank`) locally.
+
+ At the end, send each worker (via a unicast message) an `int` zero. Then,
+ wait for everyone to be at the same point in code (via a barrier). Then
+ we're done!
"""
# We import `random` here because we only use it here.
@@ -104,7 +108,7 @@ def mpi_root(mpi_comm):
# NOTE: The lower-case methods (like `bcast()`) take Python objects, and do
# the serialization for us (yay!).
# `bcast()` is blocking, in the sense that it does not return until
- # the data has been sent, but it is _not_ synchronizing.
+ # the data have been sent, but it is _not_ synchronizing.
# There's also no guarantee as to _how_ the data were conveyed.
# NOTE: In Python 3.6+, we should use `secrets` instead of `random`.
random_number = random.randrange(2 ** 32)
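
# A sketch of the matching broadcast, as mpi4py exposes it (every rank in
# the communicator must make the same `bcast()` call for it to complete):
#
#     random_number = mpi_comm.bcast(random_number, root=0)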
@@ -209,6 +213,12 @@ def mpi_nonroot(mpi_comm):
Return, via the gather process, a tuple with two items:
* The MPI "CPU Identifier" (normally a hostname)
* The calculated number, above.
+
+ Then, enter a loop: We receive a number (an `int`) from the controller. If
+ the number is zero, we exit the loop. Otherwise, we divide the number by
+ two, convert the result to an int, and send the result to the controller.
+
+ Finally, after the loop is over, we synchronize via an MPI barrier.
"""
# Get our MPI rank.