A common problem in geologic hazard assessment is estimation of the likelihood an event (earthquake, volcanic eruption, sinkhole) will develop in a specific place or within a specific area. The goal of this lesson is to introduce techniques for estimating the likelihood that events will occur within a specific area, given some distribution of previous events. A perl script will be developed for estimating point densities. GMT methods for contouring data will be introduced.
Many geologic hazards can be characterized as "point processes". Earthquake focii are point processes. Although the effects of earthquakes extend over vast areas, their focii, the point of initial slip on a fault, are quite small in volume. A first step in characterizing seismic hazards in a region is to determine the frequency and location of past earthqaukes. Earthquakes are not randomly distributed in space, they tend to cluster around the most active faults, and on a larger scale around plate margins. Therefore it is inappropriate to use a model that treats earthquakes as a homogeneous process in space. Volcanic vents cluster in basaltic volcanic fields. On the flanks of large shield volcanoes basaltic vents often occur in rift zones. Like the distribution of earthquake focii, the distribution of these vents, which are source regions for lava flows and tephra, is not usually homogeneous in space.
Consider the following map of the distribution of volcanic vents in the Pancake volcanic field, Nevada. Volcanic vents in this region form clusters in which vent density is greater than elsewhere. Furthermore, the ages of vents and lava flows (radiometric dates are shown as numbers on the maps in millions of years) shows that the vent clusters were not all active at the same time. The youngest vent cluster occurs in the NE part of the map area, the oldest cluster in the SW. Lava flows from these vents are shown in gray on the figure. Clearly the distribution of vents, and lava flow hazards, are not uniform across the map area. A nonhomogeneous estimate of the spatial distribution of events (new lava flow vents) would work best for forcasting future eruptions in this region.

Kernel density estimators are used to calculate a probability surface directly from the location and timing of past, discrete events. As a result, kernel estimators are sensitive to patterns, such as clustering, commonly observed in point distributions.
A large amount of information on density esimation is to be found in B.W. Silverman, Density esimtation for statistics and data analysis, 1986, Chapman and Hall/CRC. Check out this book if you need more information on the topic!
Density estimators take the point process and create a smooth surface that reflects the likely location of future events, should they occur. The problem of exactly how to "spread density" out from pointlike features has received a lot of attention. Should the density function have the shape of a normal probability distribution? Or is it more likely that the expected distribution of future events does not fall off uniformly with distance from the locations of past events? Kernel density functions are used to mathematically describe how density spreads out from pointlike features. Essentially, an infinite number of kernels could be used to describe exactly how this occurs. One of the tasks of a geologic hazard analyst is to determine, or guess!, which kernel function will work best. Two types of kernel density functions are the Gaussian and Epanechnikov kernels.
The Gaussian kernel is given by:
ki = 1/(2 pi) exp[-1/2 (di/h)^2]
where di is the distance from the point s to the ith point (e.g. volcano) and h is the smoothing parameter.
The Epanechnikov kernel is given by:
ki = (2/pi)[1 - (di/h)^2], if (di/h)^2 < 1 and
ki = 0, otherwise
In each case, the expected density of new events about the point s, is given by:
lambda(s) = 1/(nh^2) Sum {from i=1 to n} ki.
where n is the number of previous events that are used to estimate lambda(s).
The following may help you visualize the gaussian kernel function. These plots were created using GMT, and show the expected density distribution calculated for a single point.
Simple Contour Plot
| The contour map at left shows a gaussian kernal calculated for a single point, located at 500,500. Note how the probability density drops off symmetrically with distance from the point located at 500,500. This plot was created using the grdcontour command in gmt. First x,y,z data (which might be located randomlt within some region, -R, are interpolated onto a grid, using the gmt command "surface"; then this resulting grid is contoured and displayed. The resulting map can be annotated, just like any map. Here is a script-like listing of the commands involved:
|
|
There is more than one way to visualize this data set! The plot at right was created using the grdimage command in gmt. # surface does the interpolation with a minimum curvature algorithm # an input file name is specified - in this case it is gauss.out # this containes x,y,z, data that are contoured. |
Color Contour Plot
|
Mesh Plot
|
This mesh plot was created using gmt and the same input file as shown in the previous figures.
|
|
This plot was created using the grdview and grdgradient commands in gmt. The resulting figure is a "shaded relief" image. Note how the gaussian surface (hill) appears to be lit by sunlight from one side. This technique produces very realistic looking surfaces. The following commands are used: # surface does the interpolation with a minimum curvature algorithm # an input file name is specified - in this case it is gauss.out # this containes x,y,z, data that are contoured. # a typical input file looks like: # 0 0 7.68103944271734e-07 # 0 10 8.6928850619859e-07 # 0 20 9.81345969958306e-07 # 0 30 1.10508227488912e-06 # 0 40 1.24131312256095e-06 # 0 50 1.39085656214834e-06 # 0 60 1.55452462129013e-06 # -I10 means the grid spacing is 10 units in each direction # -Gfilename indicates the output file # -R specifies the west,east,south, and north bounds of the map surface -I10 gauss.out -Ggaussian.grd -R0/1000/0/1000 # makecpt creates a color table (to color shade contours) # -Cno_green specifies the basic color table # -T specifies the range and interval makecpt -Cno_green -T0.00001/0.0004/0.00001 > gaussian.cpt # use gridgradient to create an intensity grid file # called "intensity.grd" this creates a shaded-relief # effect - it looks like the sun is shining on the # surface. # -A and -Nt control the angle of the sun grdgradient gaussian.grd -A255 -Gintensity.grd -Nt0.75 # Now use the grdview command to get a 3D perspective #-JZ2i controls the height ofthe image # Additional terms are added to -B to control # labelling of the x,y, and Z axes # -R is expanded to include the vertial "z" region # -Qs specifies that a solid shaded relief is plotted # -I means plot the intensity grd, for shaded relief effect # -C means plot the surface according to a color table # -E specifies the look angle (perspective) on the plot grdview gaussian.grd -JX4i -JZ2i -B200:"East":/200:"North":/0.00025:"Density":SEwnZ -Qs -Iintensity.grd -Cgaussian.cpt -R0/1000/0/1000/0/0.0005 -E120/30 -V > gaussian_color_shaded.ps |
Color Shaded Relief
|
But how is the spatial density actually estimated using the equations for gaussian or epanechnikov kernels? This calculation basically involves the distance formula, and estimation of density using the kernel function. The calculation must be done repeatedly for each point (volcano, earthquake, etc) used to make the density estimate. For example, consider the volcanoes located near Yucca Mountain, Nevada, the proposed site of a high-level radioactive waste repository. The gaussian kernel density function can be used to calculate point densities about these volcanoes. A listing of the volcano locations in the Yucca Mountain region is given in the file: volcano_locations.xy. Take a quick look at this file. The first few lines are:
The file contains the x,y coordinates of the volcanoes and their names, separated by spaces. The volcano locations are not listed in long./lat., rather they are given in a Universal Transverse Mercator (UTM) coordinates and projection. In UTM projections,the first coordinate (e.g., 543780) is meters East of a central meridian (located at x = 500000) - this coordinate is called the Easting. The second coordinate (e.g., 4060380) is the meters north of the equator (located at 0 m). The coordinate is called the Northing. There are 60 central meridians, so the same UTM coordinates can refer to 60 different places on earth! It is a real good idea to report which meridian (or zone) the coordinates refer to. In this case, the volcanoes are located in zone 14. In the southern hemisphere, Northing is specfied as a number less than 10000000, where the equator is 10000000. Although this is confusing, the huge advantage of UTM cooridinates is that all of the units are already in meters. It is much easier to read the map and visualize distance between points if the coordinates are in meters rather than in degrees.
This file is read by a perl script and point densities are calculated on a grid. A listing of the perl script is: gaussian.pl. Study this script. Note that the variable $h is the smoothing parameter used in the gaussian kernel. Increasing the value of $h (in meters) makes a smoother point density map. Small values of $h will cause the density distribution to be high around volcanoes, and fall off in value very rapidly. Also note that point density estimates are made on a grid: $xmin, $xmax, $ymin, $ymax specify the bounds of this grid. The variable $grid_spacing controls the distance between grid points.
The units of the results point density distribution are manipulated by the script. After the point density is calculated, the values are transformed from events/square meter to events/square kilometer by multiplying by 1e6 (1000000). Then the values are transformed by log (base 10). This is done because the values very by orders of magnitude across the map area. Finally, very small values (less than 0.000000000001) are set to zero, as there is no reason to contour these!
The perl script can be run from the command line using:
perl gaussian.pl > gaussian.outwhich creates a file "gaussian.out" with the x y z values need to contour the point density map.
The script script5.gmt puts many of these ideas together. Run this script from the command line by typing: ./script5.gmt. If you have trouble, remember to check the file permssions using the "ls -ll" command, and change the file permissions using the "chmod" command.
Running the script should produce a map that looks like the following:

Edit the script "gaussian.pl" to create several new density maps using $h=2500, $h=5000, $h=20000. Also, plot the proposed high-level radioactive waste repository location on the maps you generate. The UTM location of the repository is approximately: 546500 E and 4087900 N. You will need to run ./gaussian.gmt each time you make these changes. Save the results as jpegs. Note the change in the probability that a volcano will erupt through the repository with changing values of the smoothing parameter, h. As usual please paste your results into your word document.