Computational Physics Lab 11
Basic MPI Programming
The goal of this lab is to learn basic principles of MPI programming.
Part 1. A Simple Program in Detail
The procedure for compiling and running an MPI program was outlined during the lecture about TTU HPCC. Using this procedure, transfer to the HPCC cluster, compile, and run the following simple program: MPIhello.cpp. Ignore the "feupdateenv is not implemented" warning from the mpiCC compiler. The computers to use for this lab are the interactive nodes in the TTU HPCC antaeus cluster:
hugin.hpcc.ttu.edu
munin.hpcc.ttu.edu
If you are using the bash shell on HPCC computers, you can use the instructions in the lectures. For the tcsh shell, a few things must be modified. This is how you can set up the relevant environment variables:
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/share/apps/mpi/openmpi/IB-icc-ifort-64/lib:/opt/intel/cce/9.1.044/lib
set path = ($path /share/apps/mpi/openmpi/IB-icc-ifort-64/bin)
rehash
Note that if the script which you are submitting to the LSF batch system starts with "#!/bin/tcsh", then you must have similar lines appended to the .cshrc file in your home directory. If the script starts with "#!/bin/bash", then you must have a .bashrc file to which you append the contents of the bashrc-temp file mentioned in the lecture.
You should use the "Phys-Class" batch queue to run various MPI jobs. To do this, make sure that you have the line
#BSUB -q Phys-Class
in your job submission script.
When you compile your programs with "mpiCC", it is useful to give them different names. This can be accomplished by using the "-o" switch with the compiler:
mpiCC MPIhello.cpp -o MPIhello
If you do this, the executable file of your program will be named MPIhello rather than a.out. You should modify your job submission script accordingly: find where a.out is mentioned in file mpi-cpp.sh and change that word to MPIhello.
If the batch system is busy, it can take a long time to execute your jobs. In this case it is better to run your programs on hugin and munin interactively. These machines have 8 CPU cores each, so they are well suited for developing parallel programs. To execute your program in the interactive mode, run
mpirun -np 4 MPIhello
The argument provided after the "-np" switch is the number of processes (copies of your program) which will run simultaneously, normally one per processor core. It is probably best to limit yourself to 4 processes while performing interactive development and debugging on these machines.
The contents of the MPIhello.cpp file are listed below:
#include <iostream>

// Include the standard MPI API header
#include "mpi.h"

int main(int argc, char* argv[])
{
    // Initialize MPI
    MPI_Init(&argc, &argv);

    // Get CPU's rank
    int myrank;
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    // This is what your program really does
    std::cout << "Hello World from processor " << myrank << std::endl;

    // Stop MPI communications
    MPI_Finalize();

    return 0;
}
This is how things are set up:
1) You must include the "mpi.h" header file which declares various MPI functions and defines a variety of constants understood by MPI.
2) The first MPI-related function call, MPI_Init, initializes the MPI environment and must be made before any other MPI function is called. Passing it argc and argv also allows you to modify the behavior of your MPI program using command line arguments. This is especially useful in order to turn on various MPI debugging printouts. For more details about this function (and many others), look up its description at http://www-unix.mcs.anl.gov/mpi/www/ or just run "man MPI_Init" in your terminal window.
3) The int myrank statement declares the variable myrank, which holds the rank of the process. Each process running your program is assigned a unique number, called its rank in MPI jargon. By knowing its rank, each copy of the program can tell itself apart from the identical copies running on other CPUs.
4) By convention, there are two types of CPUs: the CPU with rank 0 is called host or master, and all other machines are called guests or slaves. MPI itself treats all ranks alike. The ranks, which run from 0 to N-1 for N processes, are assigned when the job is launched and not by the host, so you should not rely on any particular startup order. Each CPU keeps its unique identifier for as long as the MPI program runs.
5) The MPI_Comm_rank(MPI_COMM_WORLD, &myrank) call returns a different rank for each processor running the program. The first argument, called the communicator, is a predefined constant which tells MPI which grouping of processors to communicate with. Unless one has set up groups of processors, just use the default MPI_COMM_WORLD.
6) When the code is executed, each processor prints its rank together with the "Hello World" message. A typical output from your program (in the NNNNN.pgm.out file, where NNNNN stands for the job number) should look like this:
Hello World from processor 2
Hello World from processor 3
Hello World from processor 0
Hello World from processor 1
Note that the ranks do not come out in order. This is an important feature of all MPI programs: the relative order in which different processes execute their statements is undefined unless you insert special synchronization function calls in your program.
If your program output does not look like the example shown above (apart from the order of the lines), there must be a problem either in your environment setup or in the setup of the batch node on which the program was scheduled to run (I have seen the latter problem happen). Please let me know if things do not work.
7) The last MPI-related statement executed in an MPI program must be MPI_Finalize. This statement properly terminates all MPI communications and releases related system resources.
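As a small extension of the steps above, the sketch below also asks MPI how many processes are running and branches on the rank, which is the usual way to make the host and the guests do different things. The MPI_Comm_size call is a standard MPI function, but it does not appear in MPIhello.cpp; the variable name nprocs and the printed messages are illustrative choices only.

```cpp
#include <iostream>
#include "mpi.h"

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    int myrank = 0;  // rank of this process: 0 .. nprocs-1
    int nprocs = 0;  // total number of processes in MPI_COMM_WORLD
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    // Every process runs the same executable; the rank decides the role.
    if (myrank == 0)
        std::cout << "Host sees " << nprocs << " processes" << std::endl;
    else
        std::cout << "Guest with rank " << myrank << std::endl;

    MPI_Finalize();
    return 0;
}
```

It can be compiled and run in the same way as MPIhello.cpp. As before, the order of the output lines is not guaranteed.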
Part 2. Sending and Receiving Messages
Sending and receiving data is the central mechanism of parallel computing with MPI. The file MPImessage.cpp illustrates sending and receiving messages. Please compile and execute this program. In addition to the MPI functions explained in the previous example, two new important functions are introduced: MPI_Send and MPI_Recv. The criteria for successfully sending and receiving a message are as follows:
1) The sender which calls the MPI_Send function must specify a valid rank for the receiver, and the processor of that rank must call MPI_Recv.
2) The receiver must specify a valid source rank (which can also be MPI_ANY_SOURCE to receive a message from any CPU).
3) The send and receive communicators (in this case, MPI_COMM_WORLD) must be the same.
4) The tags must match.
5) The receiver's message buffer must be large enough to hold the entire message.
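The criteria above can be seen in a minimal sketch in which rank 1 sends a single integer to rank 0. This is not the contents of MPImessage.cpp; the tag value 17 and the payload 42 are arbitrary choices made for illustration. The program needs at least two processes, for example "mpirun -np 2".

```cpp
#include <iostream>
#include "mpi.h"

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    int myrank = 0, nprocs = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int tag = 17;  // arbitrary label; sender and receiver must agree

    if (nprocs < 2) {
        if (myrank == 0)
            std::cerr << "Run with at least 2 processes" << std::endl;
    } else if (myrank == 1) {
        int value = 42;  // illustrative payload
        // Arguments: buffer, count, datatype, destination rank, tag, communicator
        MPI_Send(&value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD);
    } else if (myrank == 0) {
        int value = 0;   // receive buffer, large enough for one int
        MPI_Status status;
        // Arguments: buffer, count, datatype, source rank, tag, communicator, status
        MPI_Recv(&value, 1, MPI_INT, 1, tag, MPI_COMM_WORLD, &status);
        std::cout << "Host received " << value
                  << " from rank " << status.MPI_SOURCE << std::endl;
    }
    // Ranks above 1, if any, take no part in this exchange.

    MPI_Finalize();
    return 0;
}
```

Changing the tag, the datatype, or the destination on one side only is a quick way to see what happens when the criteria are not met: typically the receive never completes and the program hangs.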
Here is a more complicated example: MPImessage2.cpp. In this example, every guest sends an integer (which could be a result of some calculation) to the host computer. Please go through this example and understand how it works. Note that the order in which the messages are received is undefined.
When you send an MPI message, the type of the data elements transmitted must be explicitly specified by the "datatype" argument (this is the third argument in both MPI_Send and MPI_Recv). Each datatype specification is a predefined constant. Some valid specifications are
MPI_CHAR
MPI_SHORT
MPI_INT
MPI_UNSIGNED
MPI_LONG
MPI_FLOAT
MPI_DOUBLE
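Each of these specifications corresponds to the built-in C++ type of the same name: MPI_CHAR to char, MPI_SHORT to short, and so on, with MPI_UNSIGNED standing for unsigned int. For example, if you want to send an array of double precision numbers, use the MPI_DOUBLE datatype and set the "count" argument (the second one) to the number of array elements. The MPI constants page describes these and other constants in more detail.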
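To make the role of the datatype and count arguments concrete, here is a hedged sketch in which every guest sends a short array of doubles to the host, and the host accepts the messages in whatever order they arrive. The array length of 3, the tag value 5, and the numbers stored in the array are assumptions made for this illustration. MPI_Get_count is a standard MPI function which reports how many elements were actually received.

```cpp
#include <iostream>
#include "mpi.h"

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    int myrank = 0, nprocs = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int n = 3;    // illustrative array length
    const int tag = 5;  // arbitrary tag shared by all messages
    double data[n];

    if (myrank != 0) {
        // Guest: fill the array with some "results" and send it to the host.
        for (int i = 0; i < n; ++i)
            data[i] = myrank + 0.1 * i;
        MPI_Send(data, n, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD);
    } else {
        // Host: expect exactly one message from each guest, in any order.
        for (int k = 1; k < nprocs; ++k) {
            MPI_Status status;
            MPI_Recv(data, n, MPI_DOUBLE, MPI_ANY_SOURCE, tag,
                     MPI_COMM_WORLD, &status);
            int count = 0;
            MPI_Get_count(&status, MPI_DOUBLE, &count);
            std::cout << "Got " << count << " doubles from rank "
                      << status.MPI_SOURCE << ", first element "
                      << data[0] << std::endl;
        }
    }

    MPI_Finalize();
    return 0;
}
```

Because the host uses MPI_ANY_SOURCE, it has to look at status.MPI_SOURCE to find out which guest a given array came from.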
Part 3. Writing Your Own MPI Program
At this point you should have enough information to successfully write your own simple MPI program. Using MPI, distribute the computations among several nodes and build the logistic map bifurcation diagram discussed in Part 1 of Lab 3. Instead of printing out values of (mu, xn) into a file, you can have an array of integers x[1000] for each value of mu. Set x[0] to 1 if there is at least one x point in the diagram between x values of 0 and 0.001, set x[1] to 1 if there is at least one point between 0.001 and 0.002, etc. Elements for which there are no points remain 0. These arrays can be calculated on the guest CPUs for different values of mu and then passed to the host computer, which builds the whole diagram and saves it to disk in some form.
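The array described above can be filled by a function along the following lines. This is only a sketch of the serial part of the job: it assumes the logistic map in the form x(n+1) = mu*x(n)*(1 - x(n)), and the function name occupancy, the starting point 0.5, and the numbers of discarded and recorded iterations are all illustrative choices which should be adjusted to match what was done in Lab 3. Distributing the mu values among the guests and sending the arrays to the host is left to you.

```cpp
#include <vector>

// Occupancy array for one value of mu: element i is set to 1 if at least
// one recorded iterate falls into the range [i/nbins, (i+1)/nbins).
// Assumes the map x -> mu*x*(1-x); all default values are illustrative.
std::vector<int> occupancy(double mu, int nbins = 1000,
                           int nskip = 1000, int nkeep = 1000,
                           double x0 = 0.5)
{
    std::vector<int> x(nbins, 0);
    double xn = x0;

    // Discard the transient so that only the long-term behavior is recorded.
    for (int i = 0; i < nskip; ++i)
        xn = mu * xn * (1.0 - xn);

    // Record which bins are visited by the subsequent iterates.
    for (int i = 0; i < nkeep; ++i) {
        xn = mu * xn * (1.0 - xn);
        int bin = static_cast<int>(xn * nbins);
        if (bin < 0) bin = 0;                  // guard against xn < 0
        if (bin >= nbins) bin = nbins - 1;     // guard against xn == 1
        x[bin] = 1;
    }
    return x;
}
```

A guest could call occupancy(mu) for each of its mu values and send the result with MPI_Send(&x[0], 1000, MPI_INT, ...), since the elements of a std::vector<int> are stored contiguously.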
Please send in your lab report by email before 2 pm 04/17/2008.
Include the program code, the program output file, and the bifurcation diagram plot which corresponds to the output file.