Parallel Programming in the Geosciences

a short course in parallel programming techniques for

Geography, Geology and Geophysics


Goals

The primary goal of this short course is to provide a brief introduction to parallel computing techniques in the geosciences. We hope the course will be practical, in the sense that participants will:

Emphasis throughout the short course will be placed on practical solutions to computational problems in the geosciences. If at the end of the course participants are able to visualize parallel solutions to their specific computational problems, then we will consider the course a success!


Crucial Info!


Structure of the Short Course

We anticipate that course participants will have varied computing experience and we will work hard to accommodate all levels. We have elected to divide the short course into two parts. The first day will be devoted to an introduction and overview of parallel programming methods. It is open to all with any level of computing experience. The second day will be devoted to more advanced topics and is specifically geared toward those who have prior experience in computing and numerical simulation. Day 2 will concentrate on application of parallel programming techniques in finite differences and related problems.

Day 1 (Morning)

Day 1 (Afternoon)

Day 2 (Morning)

Day 2 (Afternoon)


What is Parallel Programming?

Parallel programming is the art of writing computer programs that use more than one computer processor simultaneously to solve a problem.

Why a Parallel Programming Short Course for Geoscientists?

Parallel computing methods are increasingly used to solve computationally intensive (or expensive) problems. In the Geosciences, these often involve evaluation of mathematical models that are derived from conceptual models based on observations of the natural world. Three areas where parallel processing techniques are readily used are: Stochastic Parameter Sampling, Optimization, and Numerical Simulation.

  • Stochastic Parameter Sampling. Consider a typical situation in the geosciences: a large number of poorly understood parameters contribute to a highly nonlinear problem. For example, forecasting tephra (volcanic ash) dispersion and accumulation depends on a host of input parameters about eruption and meteorological conditions. Appropriate values of each of these parameters are often poorly known but can be randomly sampled from probability density functions (which are perhaps also poorly known!). By drawing many sets of parameters, a full range of solutions is found. This range might be cast probabilistically in a hazard assessment, or might be used to guide further experiments/investigations. Clearly, the more simulations that are done, the better the full range of possible outcomes is understood. Such a problem is inherently parallel (sometimes this type of problem is called embarrassingly parallel!) because the exact same computations are performed many times on different input data (the stochastically sampled parameter values).

  • Optimization. In the geosciences, increasing attention is being paid to the goodness-of-fit between observations and realizations of our mathematical approximations of reality. How well can we infer geologically important parameters from our observations? Parallel techniques provide a powerful platform for optimization and/or inversion of data. Consider a classic example: modeling the magnetic field produced by some unknown geologic structure. Improving goodness-of-fit involves minimizing the difference between an observed map of variation in the total magnetic field and a map of calculated total magnetic field values, based on a model of subsurface variation in magnetic properties (magnetic susceptibility or remanent magnetization). One approach is to linearize the problem by making the solution depend only on the magnetization of geometrically uniform bodies (voxels) located in the subsurface. Another approach is to use optimization methods (e.g., simplex, simulated annealing) to improve goodness-of-fit in nonlinear models (e.g., with complex subsurface geometries). Both techniques (linear and nonlinear) can exploit parallel processing to greatly speed the computation. In the case of magnetics, this speedup is sufficient to make 3D inversion practical in many cases.

  • Numerical Simulation. As in other fields, numerical simulation of dynamic phenomena has become an indispensable tool in the geosciences. Numerical simulations are applied to phenomena ranging from heat transfer, to geophysical flows (ice, magma), to seismic wave propagation. A typical path is to develop increasingly complex numerical simulations that incorporate "physical realism" or at least versions of reality. Such models become computationally intensive very quickly. For example, a finite difference approximation might be used to simulate flow in a karst aquifer. The model might use the Navier-Stokes equations to simulate conduit flow in the aquifer. As the geometric complexity of the solution increases (as it must to simulate actual geology), the length of time required to solve the problem increases dramatically. This problem, like most fluid flow problems, can be parallelized by dividing the finite difference grid into a large number of smaller grids, each solving some fraction of the total problem. Each part of the grid may be solved on a different computer processor, in parallel, as long as the boundary conditions between the small grids are effectively communicated between processors.
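The grid-splitting idea in the Numerical Simulation item can be sketched in a few lines. The example below is a minimal illustration, not the course's own code: it uses 1D explicit heat diffusion as a stand-in for a real flow model, and all function names (step, decompose, exchange, solve_decomposed) are hypothetical. Each subgrid carries one "ghost" cell per side holding its neighbour's edge value. Here the subgrids are updated in an ordinary loop; in a real parallel code each would live on its own processor and the exchange step would be network communication.

```python
def step(u, alpha):
    """One explicit finite-difference step of 1D diffusion.
    The first and last values are held fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def decompose(u, nparts):
    """Split the interior of u into nparts subgrids,
    each padded with one ghost cell on either side."""
    interior = len(u) - 2
    bounds = [1 + interior * k // nparts for k in range(nparts + 1)]
    return [u[lo - 1:hi + 1] for lo, hi in zip(bounds, bounds[1:])]


def exchange(subs):
    """Copy edge values into the neighbours' ghost cells.
    This is the 'communication between processors' step."""
    for left, right in zip(subs, subs[1:]):
        left[-1] = right[1]   # right neighbour's first owned cell
        right[0] = left[-2]   # left neighbour's last owned cell


def solve_serial(u, alpha, nsteps):
    """Reference solution on the undivided grid."""
    u = list(u)
    for _ in range(nsteps):
        u = step(u, alpha)
    return u


def solve_decomposed(u, alpha, nsteps, nparts):
    """Same problem, solved on nparts subgrids with ghost-cell exchange."""
    subs = decompose(list(u), nparts)
    for _ in range(nsteps):
        # Each subgrid update is independent of the others, so each
        # could run on a separate processor.
        subs = [step(s, alpha) for s in subs]
        exchange(subs)
    # Reassemble: fixed boundary values plus each subgrid's owned cells.
    out = [u[0]]
    for s in subs:
        out.extend(s[1:-1])
    out.append(u[-1])
    return out


if __name__ == "__main__":
    grid = [0.0] * 21
    grid[10] = 100.0  # initial hot spot in the middle
    ref = solve_serial(grid, 0.25, 50)
    dec = solve_decomposed(grid, 0.25, 50, 4)
    print("max difference:", max(abs(a - b) for a, b in zip(ref, dec)))
```

The point of the comparison is that the decomposed solution matches the single-grid solution as long as ghost cells are refreshed after every step. The cost of that refresh is what limits how finely a grid can usefully be divided.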

    Of course many problems will contain elements of each of the above. It is reasonable to cast parameter variation in a karst aquifer probabilistically, and calculate a large range of bulk transmissivities by running the numerical simulation many times with stochastically sampled parameter values. It may be possible to identify an optimal set of parameters to explain transmissivities observed in an actual karst aquifer. Real problems in the geosciences can be "made parallel" at a number of levels.
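Combining stochastic sampling with a search for the best-fitting parameters can be sketched as follows. This is a toy illustration under stated assumptions: simulate is a hypothetical stand-in for an expensive forward model (it is not a karst or tephra model), the parameter ranges are invented, and the parallelism comes from Python's standard concurrent.futures module rather than from MPI.

```python
import random
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def simulate(params):
    """Hypothetical forward model: returns (params, predicted value).
    A real model would be far more expensive to evaluate."""
    a, b = params
    return params, a * a + 3.0 * b


def sample_parameters(n, seed=0):
    """Draw n parameter sets from (assumed) uniform distributions."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, 2.0), rng.uniform(0.0, 1.0)) for _ in range(n)]


def run_ensemble(param_sets, executor_cls=ProcessPoolExecutor, workers=4):
    """Evaluate the model once per parameter set.
    Every run is independent, so the runs are simply mapped onto
    a pool of workers; results come back in input order."""
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(simulate, param_sets))


def best_fit(results, observed):
    """Pick the run whose prediction is closest to the observation."""
    return min(results, key=lambda r: abs(r[1] - observed))
```

A typical call would be results = run_ensemble(sample_parameters(1000)) followed by best_fit(results, observed), with the observed value supplied by the user. On systems that start worker processes by spawning a fresh interpreter, that call needs to sit under an if __name__ == "__main__": guard. Passing executor_cls=ThreadPoolExecutor gives the same results without separate processes, which is convenient for testing but does not speed up CPU-bound Python code.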

    Note: Once you are in this loop there is no escape!


    What is a Beowulf Cluster?

    No they don't dance...

    A Beowulf cluster is a networked set of computers dedicated to parallel computations. The network usually consists of "off the shelf" personal computers connected by Ethernet. Clusters usually run the Linux operating system (a freely available UNIX-like system) and use the programming tool MPI (Message Passing Interface) to split up the work to be done and to compile and run it, in parallel, on each computer in the cluster. The first Beowulf cluster was constructed in 1994 at the Goddard Space Flight Center for modeling "grand" problems in the Earth and Space Sciences. Thomas Sterling and Donald Becker built the cluster and called it Beowulf; for better or worse, the name stuck.

    Smooth communication is the key factor that transforms a "bunch of computers" into a Beowulf cluster. In a typical cluster, each computer contains an Ethernet card and communicates with the other computers via a switch. In this configuration, each computer becomes a node on a private network. One computer contains two Ethernet cards to connect the private cluster to the public Internet. This computer, known as the master node, coordinates communication between the other nodes, called slaves. Normally the slave nodes do not have peripheral components (e.g., monitor, keyboard, mouse). The master node functions both as a member of the cluster and as the network server/coordinator for the cluster. Users run their MPI programs on the master node, which in turn splits up the work among the various slave nodes, waits for each slave to compute a solution, and then combines the partial solutions into a final answer for the user. Files are shared among the cluster nodes via NFS (Network File System). Each slave node "mounts", via NFS, certain directories located on the master node (e.g., /home and /usr/local) and uses the files located in these directories as its own. In this way, each node can run a single program compiled by the user on the master node.

    The idea of using affordable and scalable hardware components for parallel computing has become very popular because it makes "high-end" computing "highly" affordable. Beowulf clusters do not need to be large. In 1999, we (CC and LC) built a Beowulf cluster consisting of four computers, each containing a 500 MHz processor. With an efficient parallel application, this cluster of four 500 MHz nodes ran at the equivalent of about 2 GHz, not bad for the time.
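As a rough check on that figure, the standard definitions of speedup and parallel efficiency can be applied. The symbols below are introduced here for illustration: T_1 is the run time on one processor, T_p the run time on p processors.

```latex
S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}
\qquad\Longrightarrow\qquad
\text{equivalent clock rate} \approx E(p)\, p \times 500\ \text{MHz}
= 1 \times 4 \times 500\ \text{MHz} = 2\ \text{GHz}
```

The "about 2 GHz" figure therefore corresponds to an efficiency close to 1, which is only reached when the time spent communicating between nodes is small compared with the time spent computing.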

    Of course, each node can be upgraded with new CPUs, network cards, etc., as these inevitably improve. This points out another advantage of the Beowulf paradigm. Today's highly sophisticated and proprietary "supercomputer" invariably becomes tomorrow's boat anchor. While such advancement is fantastic, it is costly. Beowulf clusters exploit commodity parts, rendering them scalable both in the number of computer nodes and in their ease of upgrade.

    Lots of information is available. Use www.beowulf.org as a gateway to the best sites.

    If you are interested in building a beowulf cluster, see:

    http://www.acm.org/crossroads/xrds6-1/parallel.html

    Scientific American article


    Logging on to tuya

    The Earth Sciences / Geography Beowulf cluster at the University of Bristol is called tuya. A tuya is a landform created by volcano-glacial interaction. It also means "yours" in Spanish, which is appropriate enough. Log on to tuya using secure shell, e.g.,


    ssh username@tuya.bris.ac.uk

    where "username" is your username. Copy files to and from tuya using secure copy, e.g.,


    scp file.dat username@tuya.bris.ac.uk:.

    to put a file called "file.dat" in your main directory on your tuya account. Use:


    scp username@tuya.bris.ac.uk:file.dat .

    to copy a file from your main directory on tuya. More information about using ssh on University of Bristol computers is available at the UoB SSH site.


    What is MPI?

    MPI, the Message Passing Interface, is a programming tool that facilitates network communication between the master and slave nodes of a Beowulf cluster. More specifically, MPI simplifies the job of partitioning, sending, receiving, and recombining data over the network between the master and slave nodes during code execution. Just a handful of MPI-specific commands are required for the development of most parallel codes.
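A minimal sketch of how few MPI calls a simple parallel code needs is given below. It is written with the mpi4py Python bindings, which is an assumption: the course itself uses MPICH, and its own examples are not reproduced here. Each process asks for its rank and the total number of processes, computes its share of a sum, and the partial results are recombined on rank 0 with a reduce. The function names partial_sum and parallel_sum are hypothetical. If mpi4py is not installed, the script falls back to a single serial process.

```python
try:
    from mpi4py import MPI  # assumption: mpi4py and an MPI library are installed
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()  # this process's id: 0 .. size-1
    size = comm.Get_size()  # total number of processes
except ImportError:
    comm, rank, size = None, 0, 1  # serial fallback


def partial_sum(n, rank, size):
    """Sum of squares over this rank's share of 0..n-1.
    Rank r takes indices r, r+size, r+2*size, ..."""
    return sum(i * i for i in range(rank, n, size))


def parallel_sum(n):
    """Each rank computes its partial sum; rank 0 receives the total.
    On other ranks the return value is None."""
    local = partial_sum(n, rank, size)
    if comm is None:
        return local
    return comm.reduce(local, op=MPI.SUM, root=0)


if __name__ == "__main__":
    total = parallel_sum(1000)
    if rank == 0:
        print("sum of squares below 1000:", total)
```

Under an MPI installation this would typically be launched with something like mpiexec -n 4 python sum_squares.py, where the file name is hypothetical and the exact launcher depends on the local setup. Every process runs the same script; only the rank differs, and that is what determines which part of the work each process does.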

    Some MPI Links

    Lots of MPI discussion and examples are available. See the following:

    Comprehensive introductions to MPI:


    http://www-unix.mcs.anl.gov/mpi/
    http://www-unix.mcs.anl.gov/mpi/usingmpi/
    http://www.netlib.org/utk/papers/mpi-book/mpi-book.html
    http://www.nas.nasa.gov/Groups/SciCon/Tutorials/MPIintro/toc.html

    Additional simple examples:


    http://www.infospheres.caltech.edu/past_projects/fm-mpi/
    http://www.pdc.kth.se/training/Tutor/MPI/
    http://www.abo.fi/~mats/HPC1999/examples/

    Using MPICH

    Modifications required to use MPICH + PBS

    Don't Reinvent the Wheel: Pointers to Tools like ScaLAPACK and PETSc

    Acknowledgments

    This short course was prepared by Chuck Connor (University of South Florida), Tony Payne (University of Bristol), and Ian Stewart (University of Bristol), with the help of Laura Connor and Rob Mason. The idea of a short course was first suggested by Steve Sparks and was made possible by the Advanced Studies Institute at the University of Bristol.
    chuck connor
    Last modified: Wed Jul 3 09:51:40 EDT 2002