Lesson 5

Goals

A common problem in geologic hazard assessment is estimation of the likelihood an event (earthquake, volcanic eruption, sinkhole) will develop in a specific place or within a specific area. The goal of this lesson is to introduce techniques for estimating the likelihood that events will occur within a specific area, given some distribution of previous events. A perl script will be developed for estimating point densities. GMT methods for contouring data will be introduced.



Examples of spatial point processes

Many geologic hazards can be characterized as "point processes". Earthquake focii are point processes. Although the effects of earthquakes extend over vast areas, their focii, the point of initial slip on a fault, are quite small in volume. A first step in characterizing seismic hazards in a region is to determine the frequency and location of past earthqaukes. Earthquakes are not randomly distributed in space, they tend to cluster around the most active faults, and on a larger scale around plate margins. Therefore it is inappropriate to use a model that treats earthquakes as a homogeneous process in space. Volcanic vents cluster in basaltic volcanic fields. On the flanks of large shield volcanoes basaltic vents often occur in rift zones. Like the distribution of earthquake focii, the distribution of these vents, which are source regions for lava flows and tephra, is not usually homogeneous in space.

Consider the following map of the distribution of volcanic vents in the Pancake volcanic field, Nevada. Volcanic vents in this region form clusters in which vent density is greater than elsewhere. Furthermore, the ages of vents and lava flows (radiometric dates are shown as numbers on the maps in millions of years) shows that the vent clusters were not all active at the same time. The youngest vent cluster occurs in the NE part of the map area, the oldest cluster in the SW. Lava flows from these vents are shown in gray on the figure. Clearly the distribution of vents, and lava flow hazards, are not uniform across the map area. A nonhomogeneous estimate of the spatial distribution of events (new lava flow vents) would work best for forcasting future eruptions in this region.

Estimating spatial point processes

Kernel density estimators are used to calculate a probability surface directly from the location and timing of past, discrete events. As a result, kernel estimators are sensitive to patterns, such as clustering, commonly observed in point distributions.

A large amount of information on density esimation is to be found in B.W. Silverman, Density esimtation for statistics and data analysis, 1986, Chapman and Hall/CRC. Check out this book if you need more information on the topic!

Density estimators take the point process and create a smooth surface that reflects the likely location of future events, should they occur. The problem of exactly how to "spread density" out from pointlike features has received a lot of attention. Should the density function have the shape of a normal probability distribution? Or is it more likely that the expected distribution of future events does not fall off uniformly with distance from the locations of past events? Kernel density functions are used to mathematically describe how density spreads out from pointlike features. Essentially, an infinite number of kernels could be used to describe exactly how this occurs. One of the tasks of a geologic hazard analyst is to determine, or guess!, which kernel function will work best. Two types of kernel density functions are the Gaussian and Epanechnikov kernels.

The Gaussian kernel is given by:

ki = 1/(2 pi) exp[-1/2 (di/h)^2]

where di is the distance from the point s to the ith point (e.g. volcano) and h is the smoothing parameter.

The Epanechnikov kernel is given by:

ki = (2/pi)[1 - (di/h)^2], if (di/h)^2 < 1 and

ki = 0, otherwise

In each case, the expected density of new events about the point s, is given by:

lambda(s) = 1/(nh^2) Sum {from i=1 to n} ki.

where n is the number of previous events that are used to estimate lambda(s).

The following may help you visualize the gaussian kernel function. These plots were created using GMT, and show the expected density distribution calculated for a single point.

Simple Contour Plot

The contour map at left shows a gaussian kernal calculated for a single point, located at 500,500. Note how the probability density drops off symmetrically with distance from the point located at 500,500. This plot was created using the grdcontour command in gmt. First x,y,z data (which might be located randomlt within some region, -R, are interpolated onto a grid, using the gmt command "surface"; then this resulting grid is contoured and displayed. The resulting map can be annotated, just like any map. Here is a script-like listing of the commands involved:

# surface does the interpolation with a minimum curvature algorithm
# an input file name is specified - in this case it is gauss.out
# this containes x,y,z, data that are contoured.
# a typical input file looks like:
# 0 0 7.68103944271734e-07
# 0 10 8.6928850619859e-07
# 0 20 9.81345969958306e-07
# 0 30 1.10508227488912e-06
# 0 40 1.24131312256095e-06
# 0 50 1.39085656214834e-06
# 0 60 1.55452462129013e-06
# -I10 means the grid spacing is 10 units in each direction
# -Gfilename indicates the output file
# -R specifies the west,east,south, and north bounds of the map
surface -I10 gauss.out -Ggaussian.grd -R0/1000/0/1000
#grdcontour draws the contours from the grid
# -Jx0.005i means the map will be 1:2000 scale
# -C0.00005 means there is a 0.0005 contour interval
# -A500 means the contours are annotated every 500 units
# -B200a200f200/WSne draw the frame, 200 m tick with 200 m label, add 200 m ticks, label on
# west and south side only
# -W0.25p set line width
# -P draw in portrait mode
grdcontour gaussian.grd -Jx0.005i -C0.00005 -A0.0001 -B200a200f200/WSne -W0.25p -P -K -Y3.0i -X1.5i > gauss_contour.ps

# psxy plots the data points as solid squares
# -S specifies the size and shape
# - G0 makes the squares solid
psxy -R -Jx0.005i -Sc0.1i -G0 -O -P -K < < EOF>> gauss_contour.ps
500 500
EOF

There is more than one way to visualize this data set! The plot at right was created using the grdimage command in gmt.


# surface does the interpolation with a minimum curvature algorithm
# an input file name is specified - in this case it is gauss.out
# this containes x,y,z, data that are contoured.

Color Contour Plot


# a typical input file looks like:
# 0 0 7.68103944271734e-07
# 0 10 8.6928850619859e-07
# 0 20 9.81345969958306e-07
# 0 30 1.10508227488912e-06
# 0 40 1.24131312256095e-06
# 0 50 1.39085656214834e-06
# 0 60 1.55452462129013e-06
# -I10 means the grid spacing is 10 units in each direction
# -Gfilename indicates the output file
# -R specifies the west,east,south, and north bounds of the map
surface -I10 gauss.out -Ggaussian.grd -R0/1000/0/1000

# makecpt creates a color table (to color shade contours)
# -Cno_green specifies the basic color table
# -T specifies the range and interval
makecpt -Cno_green -T0.00001/0.0004/0.00001 > gaussian.cpt

# grdimage plots the image (color map)
# -JX6.0i is the scale (must match the following)
# -Ccn.cpt is the color scale created with makecpt
# -P portait mode (must match the following)
# -E is the dpi of the color shading
# -K more postscript to be appended to cn.ps in the following
grdimage gaussian.grd -R0/1000/0/1000 -Jx0.005i -B200g200/SWne -Cgaussian.cpt -Y3.0i -X1.5i -P -E100 -K -V > gauss_color.ps

# psxy plots the data points as solid squares
# -S specifies the size and shape
# - G0 makes the squarews solid
psxy -R -Jx0.005i -Sc0.1i -G0 -O -P -K <> gauss_color.ps
500 500
EOF

#draw the scale to show what color variation
#means in terms of values
psscale -D2.5i/-1i/6i/0.25ih -Cgaussian.cpt -B0.0002:"Density Estimate (events/km@+2@+)": -O -P -V >> gauss_color.ps

Mesh Plot

This mesh plot was created using gmt and the same input file as shown in the previous figures.

# surface does the interpolation with a minimum curvature algorithm
# an input file name is specified - in this case it is gauss.out
# this containes x,y,z, data that are contoured.
# a typical input file looks like:
# 0 0 7.68103944271734e-07
# 0 10 8.6928850619859e-07
# 0 20 9.81345969958306e-07
# 0 30 1.10508227488912e-06
# 0 40 1.24131312256095e-06
# 0 50 1.39085656214834e-06
# 0 60 1.55452462129013e-06
# -I10 means the grid spacing is 10 units in each direction
# -Gfilename indicates the output file
# -R specifies the west,east,south, and north bounds of the map
surface -I10 gauss.out -Ggaussian.grd -R0/1000/0/1000

# Now use the grdview command to get a 3D perspective
#-JZ2i controls the height ofthe image
# Additional terms are added to -B to control
# labelling of the x,y, and Z axes
# -R is expanded to include the vertial "z" region
# -Qm specifies that a wire mesh is plotted
# -E specifies the look angle (perspective) on the plot

grdview gaussian.grd -JX4i -JZ2i -B200:"East":/200:"North":/0.00025:"Density":SEwnZ -N0.00001/255/255/255 -Qm -R0/1000/0/1000/0/0.0005 -E120/30 -V > gaussian_mesh.ps


This plot was created using the grdview and grdgradient commands in gmt. The resulting figure is a "shaded relief" image. Note how the gaussian surface (hill) appears to be lit by sunlight from one side. This technique produces very realistic looking surfaces. The following commands are used:


# surface does the interpolation with a minimum curvature algorithm
# an input file name is specified - in this case it is gauss.out
# this containes x,y,z, data that are contoured.
# a typical input file looks like:
# 0 0 7.68103944271734e-07
# 0 10 8.6928850619859e-07
# 0 20 9.81345969958306e-07
# 0 30 1.10508227488912e-06
# 0 40 1.24131312256095e-06
# 0 50 1.39085656214834e-06
# 0 60 1.55452462129013e-06
# -I10 means the grid spacing is 10 units in each direction
# -Gfilename indicates the output file
# -R specifies the west,east,south, and north bounds of the map
surface -I10 gauss.out -Ggaussian.grd -R0/1000/0/1000

# makecpt creates a color table (to color shade contours)
# -Cno_green specifies the basic color table
# -T specifies the range and interval
makecpt -Cno_green -T0.00001/0.0004/0.00001 > gaussian.cpt

# use gridgradient to create an intensity grid file
# called "intensity.grd" this creates a shaded-relief
# effect - it looks like the sun is shining on the
# surface.
# -A and -Nt control the angle of the sun
grdgradient gaussian.grd -A255 -Gintensity.grd -Nt0.75


# Now use the grdview command to get a 3D perspective
#-JZ2i controls the height ofthe image
# Additional terms are added to -B to control
# labelling of the x,y, and Z axes
# -R is expanded to include the vertial "z" region
# -Qs specifies that a solid shaded relief is plotted
# -I means plot the intensity grd, for shaded relief effect
# -C means plot the surface according to a color table
# -E specifies the look angle (perspective) on the plot

grdview gaussian.grd -JX4i -JZ2i -B200:"East":/200:"North":/0.00025:"Density":SEwnZ -Qs -Iintensity.grd -Cgaussian.cpt -R0/1000/0/1000/0/0.0005 -E120/30 -V > gaussian_color_shaded.ps

Color Shaded Relief



Calculating point density with Perl

But how is the spatial density actually estimated using the equations for gaussian or epanechnikov kernels? This calculation basically involves the distance formula, and estimation of density using the kernel function. The calculation must be done repeatedly for each point (volcano, earthquake, etc) used to make the density estimate. For example, consider the volcanoes located near Yucca Mountain, Nevada, the proposed site of a high-level radioactive waste repository. The gaussian kernel density function can be used to calculate point densities about these volcanoes. A listing of the volcano locations in the Yucca Mountain region is given in the file: volcano_locations.xy. Take a quick look at this file. The first few lines are:


543780 4060380 Lathrop Wells
522120 4110340 Little Black Peak
523400 4112600 Hidden Cone
540350 4079360 Northern Cone
538840 4074120 Black Cone
537580 4071880 Red Cone

The file contains the x,y coordinates of the volcanoes and their names, separated by spaces. The volcano locations are not listed in long./lat., rather they are given in a Universal Transverse Mercator (UTM) coordinates and projection. In UTM projections,the first coordinate (e.g., 543780) is meters East of a central meridian (located at x = 500000) - this coordinate is called the Easting. The second coordinate (e.g., 4060380) is the meters north of the equator (located at 0 m). The coordinate is called the Northing. There are 60 central meridians, so the same UTM coordinates can refer to 60 different places on earth! It is a real good idea to report which meridian (or zone) the coordinates refer to. In this case, the volcanoes are located in zone 14. In the southern hemisphere, Northing is specfied as a number less than 10000000, where the equator is 10000000. Although this is confusing, the huge advantage of UTM cooridinates is that all of the units are already in meters. It is much easier to read the map and visualize distance between points if the coordinates are in meters rather than in degrees.

This file is read by a perl script and point densities are calculated on a grid. A listing of the perl script is: gaussian.pl. Study this script. Note that the variable $h is the smoothing parameter used in the gaussian kernel. Increasing the value of $h (in meters) makes a smoother point density map. Small values of $h will cause the density distribution to be high around volcanoes, and fall off in value very rapidly. Also note that point density estimates are made on a grid: $xmin, $xmax, $ymin, $ymax specify the bounds of this grid. The variable $grid_spacing controls the distance between grid points.

The units of the results point density distribution are manipulated by the script. After the point density is calculated, the values are transformed from events/square meter to events/square kilometer by multiplying by 1e6 (1000000). Then the values are transformed by log (base 10). This is done because the values very by orders of magnitude across the map area. Finally, very small values (less than 0.000000000001) are set to zero, as there is no reason to contour these!

The perl script can be run from the command line using:

perl gaussian.pl > gaussian.out

which creates a file "gaussian.out" with the x y z values need to contour the point density map.



Contouring the Results

The script script5.gmt puts many of these ideas together. Run this script from the command line by typing: ./script5.gmt. If you have trouble, remember to check the file permssions using the "ls -ll" command, and change the file permissions using the "chmod" command.

Running the script should produce a map that looks like the following:



Assignment 5

Edit the script "gaussian.pl" to create several new density maps using $h=2500, $h=5000, $h=20000. Also, plot the proposed high-level radioactive waste repository location on the maps you generate. The UTM location of the repository is approximately: 546500 E and 4087900 N. You will need to run ./gaussian.gmt each time you make these changes. Save the results as jpegs. Note the change in the probability that a volcano will erupt through the repository with changing values of the smoothing parameter, h. As usual please paste your results into your word document.


Chuck Connor
Last modified: Thu Jan 12 11:21:34 EST 2006