Computational Physics Lab 12
MPI Performance Study
The goal of this lab is to practice parallel programming with MPI and
to develop some intuition about code speedups which can be achieved by
running a parallel version of an algorithm.
1) Write a parallel version of the 1-d wave equation solver (you have already developed a serial version in Lab 6).
To simplify things, assume that the number of points in the spatial
grid is known in advance (it can be made a compile-time constant), and
that the string ends are fixed. Also assume that the initial conditions
can be generated by a function, so that you can just call this function
on each node at the beginning of your program.
If you have N computing nodes executing the program, split the string into N
equal sections and make each node responsible for propagating the string
displacement in its own section. Each node has enough information to
process its section with the exception of the leftmost and the
rightmost points. The current value of the string displacement to the
left of the leftmost point should be received from the node processing
the neighbor section on the left. Similarly, to propagate the rightmost
point, one needs to gather the information from the node processing the
section to the right. Sections 0 and N-1 are special: the nodes
processing them will have to talk to one neighbor only.
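The decomposition described above can be sketched in plain C, before any MPI calls are added. This is only an illustration under stated assumptions: the function names are hypothetical, the number of grid points is assumed to divide evenly by the number of nodes, and the update is the standard explicit leapfrog scheme, which may differ from the one in your Lab 6 code. Each section stores its n_local points in entries 1..n_local of an array of n_local + 2 entries; entries 0 and n_local + 1 are "ghost" cells holding the neighbors' edge values.

```c
/* Number of points owned by each section.
   Assumption: npoints is divisible by nprocs. */
int section_size(int npoints, int nprocs)
{
    return npoints / nprocs;
}

/* Global index of the first point owned by section `rank`. */
int section_start(int npoints, int nprocs, int rank)
{
    return rank * section_size(npoints, nprocs);
}

/*
 * Advance one section by one time step with the leapfrog scheme
 *   u_new[i] = 2 u[i] - u_old[i] + r2 (u[i+1] - 2 u[i] + u[i-1]),
 * where r2 = (c dt / dx)^2.
 * Entries 0 and n_local + 1 of u are ghost cells and must already hold
 * the neighbors' current edge values. The fixed string ends (first point
 * of section 0, last point of section nprocs - 1) keep their current value.
 */
void step_section(const double *u_old, const double *u, double *u_new,
                  int n_local, double r2, int rank, int nprocs)
{
    int first = (rank == 0) ? 2 : 1;
    int last = (rank == nprocs - 1) ? n_local - 1 : n_local;

    for (int i = first; i <= last; ++i) {
        u_new[i] = 2.0 * u[i] - u_old[i]
                 + r2 * (u[i + 1] - 2.0 * u[i] + u[i - 1]);
    }
    if (rank == 0) {
        u_new[1] = u[1];              /* fixed left end of the string */
    }
    if (rank == nprocs - 1) {
        u_new[n_local] = u[n_local];  /* fixed right end of the string */
    }
}
```

With this layout, the only data a node is missing at each step are the two ghost values, which is exactly what has to be exchanged with the neighbors.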
Since the information is exchanged between neighboring nodes, you might find the MPI_Sendrecv
function useful. A single call can replace a sequence in which a node
sends some data to another node and then receives data from the same
node.
After processing a number of time steps (choose the
number of steps so that your program runs for at least a few seconds),
make all guest nodes send their results to the host node and write
the displacement of the whole string into a file. Visualize the
solution. Use your Lab 6 code to make sure that your parallel code
produces the correct result. This, in fact, is a very common way to
develop parallel programs -- serial code is written first, and
then parallelized while making sure that the results remain the same
for various test cases.
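The comparison with the serial code can be reduced to a single number. The helper below is a minimal sketch with a hypothetical name, max_abs_diff; it assumes both displacement arrays have already been loaded into memory (for example, read back from the two output files).

```c
#include <stddef.h>

/* Largest absolute pointwise difference between two displacement arrays. */
double max_abs_diff(const double *a, const double *b, size_t n)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = a[i] - b[i];
        if (d < 0.0) {
            d = -d;
        }
        if (d > worst) {
            worst = d;
        }
    }
    return worst;
}
```

If the parallel code performs the same arithmetic on each point as the serial code, the difference should be zero or at the level of floating-point round-off. A difference that is largest near the section boundaries usually points to a mistake in the neighbor exchange.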
2) Build a speedup
plot of your program. This is the graph of the run time on one processor
divided by the run time on N processors, versus the number of processors N. The
program timing can be obtained using the MPI_Wtime function. Do this for several different numbers of spatial points in the grid: 10^4, 10^5, 10^6. Using Amdahl's law, estimate the parallel fraction of your program.
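Amdahl's law relates the speedup S on N processors to the parallel fraction p of the program: S(N) = 1 / ((1 - p) + p/N). Solving for p gives p = (1 - 1/S) / (1 - 1/N), so each measured point with N > 1 yields its own estimate of p. The two helper functions below are a small sketch of this arithmetic; their names are hypothetical.

```c
/* Speedup predicted by Amdahl's law for parallel fraction p on n processors. */
double amdahl_speedup(double p, int n)
{
    return 1.0 / ((1.0 - p) + p / n);
}

/* Parallel fraction implied by a measured speedup s on n > 1 processors. */
double amdahl_parallel_fraction(double s, int n)
{
    return (1.0 - 1.0 / s) / (1.0 - 1.0 / n);
}
```

If the estimates of p obtained for different N agree with each other, the model describes your program well. A systematic drift with N usually indicates overhead that the law ignores, such as communication time that grows with the number of nodes.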
Please send in your lab report by email
before 2 pm 04/29/2008.
Include the program code, the speedup plot, and your conclusions.