This tutorial explores the use of two R packages: ggplot2 and ggmap, for visualizing the distribution of spatiotemporal events. To this end, we make use of spatial heat maps, i.e., a heat map that is overlaid on a geographical map where the events actually took place. The data used in this tutorial are the drone strike incidents (i.e., the “events”) in Northwest Pakistan conducted by the U.S. intelligence as part of its extensive and prolonged War on Terror. Data were collected by the Bureau of Investigative Journalism from 2004 to 2013. Source code of the tutorial is available on GitHub as well as the dataset.
We first need to load the necessary R libraries that we are going to use and the data file. We also want to subset the events that happened in 2008 or later (since there are more data points then).
We then draw a rectangular map (retrieved from Google Maps via the function
get_map) centered at the mean longitude and latitude coordinates of all the events. Because the retrieved map is a raster object, we have to convert it into a
ggplot object in order for it to be plottable.
We now visualize the density of the events over a 2D space by adding polygonal density layers over the map (using the
stat_density2d function). The density is described by the longitude and latitude coordinates of the events themselves. The density layers are “filled”, i.e., shaded, according to their density level (which is automatically calculated using the
..level.. parameter). To add more aesthetic, the transparency level (given by the parameter
alpha) of each layer is also automatically tuned according to the calculated density level. We finally color the density layers (or contours) using a defined spectrum of colors smoothed out by gradient (supported by the
We next illustrate the strike locations using the
geom_point function of the
ggplot2 package. We depict those as red circles (
shape=21, see more shapes here) and give them some level of transparency (
alpha=0.8). We then disable distracting legends and give our map a title.
Finally, we split the events up yearly using the
facet_wrap function to produce multiple “subplots” or facets, where each is a yearly data, and plot the distribution of events over the years.
This is what the resulted heat maps look like. We observe that year 2010 seems to have more densely located (and probably more intense) strikes than the rest. Check the reality here for yourself!