Beyond Basic R – Mapping

August 15, 2018
By

(This article was first published on The USGS OWI blog , and kindly contributed to R-bloggers)

Introduction

There are many different R packages for dealing with spatial data. The main distinctions between them involve the types of data they work with — raster or vector — and the sophistication of the analyses they can do. Raster data can be thought of as pixels, similar to an image, while vector data consists of points, lines, or polygons. Spatial data manipulation can be quite complex, but creating some basic plots can be done with just a few commands. In this post, we will show simple examples of raster and vector spatial data for plotting a watershed and gage locations, and link to some other more complex examples.

Setting up

First, let’s download an example shapefile (a polygon) of a HUC8 from western Pennsylvania, using the sbtools package to access ScienceBase. The st_read function from the sf (for “simple features”) package reads in the shapefile. We will be using sf throughout these examples to manipulate the points and polygon for the gages and HUC. Then we’ll retrieve gages with discharge from this watershed using the dataRetrieval package. Both dataRetrieval and sbtools are covered in our USGS Packages curriculum.

library(sbtools)
library(dataRetrieval)
library(sf)

item_file_download(sb_id = "5a83025ce4b00f54eb32956b", 
                   names = "huc8_05010007_example.zip", 
                   destinations = "huc8_05010007_example.zip", 
                   overwrite_file = TRUE)
## [1] "huc8_05010007_example.zip"
unzip('huc8_05010007_example.zip', overwrite = TRUE)
#note st_read takes the folder, not a particular file
huc_poly <- st_read('huc8_05010007_example')
## Reading layer `wbdhu8_alb_simp' from data source `D:\LADData\RCode\owi-blog\content\huc8_05010007_example' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 9 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -79.45512 ymin: 39.91875 xmax: -78.55573 ymax: 40.77377
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
class(huc_poly)
## [1] "sf"         "data.frame"
str(huc_poly)
## Classes 'sf' and 'data.frame':   1 obs. of  10 variables:
##  $ REGION    : Factor w/ 1 level "Ohio Region": 1
##  $ SUBREGION : Factor w/ 1 level "Allegheny": 1
##  $ BASIN     : Factor w/ 1 level "Allegheny": 1
##  $ SUBBASIN  : Factor w/ 1 level "Conemaugh": 1
##  $ HUC_2     : Factor w/ 1 level "05": 1
##  $ HUC_4     : Factor w/ 1 level "0501": 1
##  $ HUC_6     : Factor w/ 1 level "050100": 1
##  $ HUC_8     : Factor w/ 1 level "05010007": 1
##  $ HU_8_STATE: Factor w/ 1 level "PA": 1
##  $ geometry  :sfc_POLYGON of length 1; first list element: List of 1
##   ..$ : num [1:256, 1:2] -79 -79 -79 -79 -79 ...
##   ..- attr(*, "class")= chr  "XY" "POLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA
##   ..- attr(*, "names")= chr  "REGION" "SUBREGION" "BASIN" "SUBBASIN" ...
st_geometry(huc_poly)
## Geometry set for 1 feature 
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -79.45512 ymin: 39.91875 xmax: -78.55573 ymax: 40.77377
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
st_bbox(huc_poly)
##      xmin      ymin      xmax      ymax 
## -79.45512  39.91875 -78.55573  40.77377
st_crs(huc_poly)
## Coordinate Reference System:
##   EPSG: 4326 
##   proj4string: "+proj=longlat +datum=WGS84 +no_defs"
huc_gages <- whatNWISdata(huc = "05010007", parameterCd = "00060", service="uv")
head(huc_gages)
##     agency_cd  site_no                                        station_nm
## 197      USGS 03039925 North Fork Bens Creek at North Fork Reservoir, PA
## 591      USGS 03040000                  Stonycreek River at Ferndale, PA
## 621      USGS 03041029                Conemaugh River at Minersville, PA
## 731      USGS 03041500                     Conemaugh River at Seward, PA
## 824      USGS 03042000                  Blacklick Creek at Josephine, PA
## 964      USGS 03042280                  Yellow Creek near Homer City, PA
##     site_tp_cd dec_lat_va dec_long_va coord_acy_cd dec_coord_datum_cd
## 197         ST   40.26619   -79.01669            S              NAD83
## 591         ST   40.28563   -78.92058            S              NAD83
## 621         ST   40.34139   -78.92611            S              NAD83
## 731         ST   40.41924   -79.02614            S              NAD83
## 824         ST   40.47701   -79.18670            S              NAD83
## 964         ST   40.57257   -79.10337            S              NAD83
##       alt_va alt_acy_va alt_datum_cd   huc_cd data_type_cd parm_cd stat_cd
## 197                      05010007           uv   00060    
## 591  1184.06        .01       NGVD29 05010007           uv   00060    
## 621     1140        .01       NGVD29 05010007           uv   00060    
## 731  1076.01        .01       NGVD29 05010007           uv   00060    
## 824   954.46        .01       NAVD88 05010007           uv   00060    
## 964     1140        .01       NGVD29 05010007           uv   00060    
##      ts_id loc_web_ds medium_grp_cd parm_grp_cd  srs_id access_cd
## 197 221326                  wat         1645423         0
## 591 122456                  wat         1645423         0
## 621 122464                  wat         1645423         0
## 731 122468                  wat         1645423         0
## 824 122471                  wat         1645423         0
## 964 122477                  wat         1645423         0
##     begin_date   end_date count_nu
## 197 1987-10-03 1998-09-26     4011
## 591 1991-10-01 2018-08-16     9816
## 621 2001-12-13 2018-08-16     6090
## 731 1991-10-01 2018-08-16     9816
## 824 1991-10-15 2018-08-16     9802
## 964 1991-10-01 2018-08-16     9816

The huc_poly object is a new type of object that we haven’t seen — it has classes sf as well as data.frame. Looking inside the object with the str command, we can see it is structured very much like a data.frame with several factor columns, except for the geometry column, which is of type sfc_POLYGON. sf provides various functions to extract useful information from this kind object, generally prefixed with st_. st_geometry extracts the entire geometry part of the object; st_bbox extracts the bounding box from the geometry; and st_crs extracts the coordinate reference system. There are others, but we will use these three to get the parts of the sf object we need for plotting.

Now that we understand this new object, let’s make some maps.

Raster map example

For the raster map, we will use the ggmap package to create a road and satellite basemaps for the HUC. Since the basemaps that ggmap uses are quite detailed, they are too large to include with the package and must be retrieved from the web with the get_map function. For the location argument, we are getting the bbox from the huc_poly object. st_bbox returns the bbox in the format we need, except for the names, which we add with setNames. The ggmap function is analogous to the ggplot function in the ggplot2 package that you have likely already used. It creates the base map, which we can then add things to. Many of the commands used here are from the ggplot2 package (ggmap imports them), and others could be used to further customize this map.

Note that ggmap is probably not a good choice if you need your data to be in a particular projection. Compared to base plotting, it provides simplicity at the cost of control.

library(ggmap)

bbox <- setNames(st_bbox(huc_poly), c("left", "bottom", "right", "top"))
#setting zoom to 9 gives us a bit of padding around the bounding box
basemap_streets <- get_map(maptype = "roadmap", location = bbox, zoom = 9)
basemap_satellite <- get_map(maptype = "satellite", location = bbox, zoom = 9)
street_map <- ggmap(basemap_streets) 
satellite_map <- ggmap(basemap_satellite)
print(street_map)

plain base maps

print(satellite_map)

plain base maps

Now we can start adding to our maps. First, we convert the huc_gages data.frame to an sf object using st_as_sf, assigning it the same coordinate reference system as huc_poly using st_crs. ggplot functions like geom_sf and geom_text add to your base map.

huc_gages_sf <- st_as_sf(huc_gages, coords = c("dec_long_va", "dec_lat_va"), 
                         crs = st_crs(huc_poly), remove = FALSE)
satellite_map + geom_sf(data = huc_poly,
                        inherit.aes = FALSE,
                        color = "white", fill = NA) +
  geom_sf(data = huc_gages_sf, inherit.aes = FALSE, color = "red") +
  geom_text(data = huc_gages_sf, 
            aes(label = site_no, x = dec_long_va, y = dec_lat_va),
            hjust = 0, size=2.5, nudge_x = 0.02, col = "yellow")

base maps with HUC and gages

street_map + geom_sf(data = huc_poly,
                        inherit.aes = FALSE,
                        color = "black", fill = NA) +
  geom_sf(data = huc_gages_sf, inherit.aes = FALSE, color = "red") +
  geom_text(data = huc_gages_sf, aes(label = site_no, x = dec_long_va, y = dec_lat_va),
             hjust = 0, size=2.5, nudge_x = 0.02, col = "black")

base maps with HUC and gages

Vector map example

If we don’t want any raster background to our maps, we can use base plotting and the maps package. This style of map can be nicer for insets or large scale maps that would be cluttered with a raster background. The maps package provides easily accessible political boundary maps, that can be overlaid with other shapefiles. As with regular base plotting, you can think of creating maps like painting — every layer has to go on in the right order. You can’t remove things without starting over. Fortunately, you can start over with just a few keystrokes since you are scripting your plot! Run these commands one by one to see the map take shape — first we create a blank state map, then add county lines in white, then the HUC boundary, then the gage points, and then the legend. Note that we use the st_geometry function inside of the plot command so that we only plot the huc_poly and huc_gages_sf geometry, and not the other information in their data frames.

library(maps)
map(database = 'state', regions = 'Pennsylvania', col = "tan", fill = TRUE, border = NA)
#this draws all PA counties since the regions argument uses partial matching
map(database = 'county', regions = 'Pennsylvania', col = "white", fill = FALSE, add = TRUE)
plot(st_geometry(huc_poly), col = NA, add = TRUE)
plot(st_geometry(huc_gages_sf), add = TRUE, col = "red", pch = 19, cex = 0.7)
legend("bottomright", legend = c("Gage", "Subbasin boundary"), pch = c(19,NA), lty = c(NA, 1),
       col = c("red", "black"), title = "Legend")
title("Conemaugh Subbasin")

Polygon map of Pennsylvania

Similarly, we can create a map zoomed in to the HUC polygon. Note that we set the x and y limits of the map by extracting the limits of the bbox object we created earlier. We can use the names left, right, etc. because bbox is a named vector.

  map(database = 'county', regions = 'Pennsylvania', col = "lightgray", 
      xlim = bbox[c('left', 'right')], ylim = bbox[c('bottom', 'top')])
 plot(st_geometry(huc_poly), col = "dodgerblue", border = NA, add = TRUE)
  box()
 plot(st_geometry(huc_gages_sf), add = TRUE, col = "red", pch = 19, cex = 0.7)
  legend("bottomleft", legend = c("Gage", "Conemaugh subbasin"), pch = c(19,NA), lty = c(NA, 1),
       col = c("red", "dodgerblue"), title = "Legend", lwd = c(1,15), bg = "wheat")
  text(x = huc_gages$dec_long_va, y = huc_gages$dec_lat_va, labels = huc_gages$site_no,
       pos = 4, cex = 0.7)

Polygon map zoomed to HUC

Other packages and examples

Like plotting in R, there are endless intricacies to making maps, and we are only really scratching the surface here. Some other packages that you may find useful for certain applications include:

Also, check out our additional topics in R page for links to some other tutorials.

To leave a comment for the author, please follow the link and comment on their blog: The USGS OWI blog .

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)