Mapping Traffic Fatalities

August 31, 2016

(This article was first published on lucaspuente.github.io/, and kindly contributed to R-bloggers)

On Monday, August 29, DJ Patil, the Chief Data Scientist in the White House Office of Science and Technology Policy, and Mark Rosekind, the Administrator of the National Highway Traffic Safety Administration (NHTSA), announced the release of a data set documenting all traffic fatalities occurring in the United States in 2015. As part of their release, they issued a “call to action” for data scientists and analysts to “jump in and analyze it.” This post does exactly that by plotting these fatalities and providing the code for others to reproduce and extend the analysis.

Step 1: Download and Clean the Data

The NHTSA made downloading this data set very easy. Simply visit ftp://ftp.nhtsa.dot.gov/fars/2015/National/, download the FARS2015NationalDBF.zip file, unzip it, and load the accident table into R:

library(foreign)
accidents <- read.dbf("FARS2015NationalDBF/accident.dbf")
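A quick sanity check after loading can help confirm what each row represents. In the FARS accident table each row is one fatal crash, and the FATALS column records how many people died in it, so the row count and the death count differ. The sketch below uses a small made-up data frame in place of the real file; the summarize_crashes helper is a hypothetical name used only for illustration.

```r
# Toy stand-in for the accident table; these rows are invented for illustration.
toy_accidents <- data.frame(
  ST_CASE = c(10001, 10002, 10003),
  FATALS  = c(1, 2, 1)
)

# Hypothetical helper: report crashes (rows) and deaths (sum of FATALS).
summarize_crashes <- function(df) {
  c(crashes = nrow(df), fatalities = sum(df$FATALS))
}

crash_summary <- summarize_crashes(toy_accidents)
print(crash_summary)
```

Calling summarize_crashes(accidents) on the real table should work the same way, assuming the FATALS column is present.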

Since the goal here is to map the traffic fatalities, I also recommend subsetting the data to only include rows that have valid coordinates:

accidents <- subset(accidents, LONGITUD != 999.99990 & LONGITUD != 888.88880 & LONGITUD != 777.77770)
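An alternative to listing each placeholder code is to keep only rows whose coordinates fall inside the valid range for longitude and latitude. This is a sketch of that approach, not the filter used above. The toy data frame, its placeholder-style values, and the has_valid_coords helper are all invented for illustration.

```r
# Toy stand-in for the accident table; rows 2 and 4 mimic placeholder codes.
toy_accidents <- data.frame(
  LATITUDE = c(34.05, 99.9999, 41.88, 77.7777),
  LONGITUD = c(-118.24, 999.9999, -87.63, 777.7777)
)

# Hypothetical helper: TRUE where both coordinates are present and in range.
# Note that a latitude placeholder such as 77.7777 is itself within [-90, 90],
# so in that row it is the longitude test that rejects it.
has_valid_coords <- function(df) {
  !is.na(df$LATITUDE) & !is.na(df$LONGITUD) &
    abs(df$LATITUDE) <= 90 & abs(df$LONGITUD) <= 180
}

valid_accidents <- toy_accidents[has_valid_coords(toy_accidents), ]
print(nrow(valid_accidents))
```

A range check like this also drops missing values, which the explicit comparisons would otherwise need to handle separately.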

Also, the map we’ll be producing will only include the lower 48 states, so we want to further subset the data to exclude Alaska and Hawaii:

cont_us_accidents <- subset(accidents, STATE != 2 & STATE != 15)
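The same exclusion can be written with %in%, which scales more gracefully if further codes need to be dropped later. A minimal sketch on invented data; the toy_accidents rows and the excluded_states vector are illustrative names, with 2 and 15 being the codes for Alaska and Hawaii used in the filter above.

```r
# Toy stand-in; STATE holds numeric state codes as in the filter above.
toy_accidents <- data.frame(
  STATE  = c(1, 2, 6, 15, 48),
  FATALS = c(1, 1, 2, 1, 1)
)

# Codes to drop, named for readability (2 = Alaska, 15 = Hawaii).
excluded_states <- c(Alaska = 2, Hawaii = 15)

# Keep every row whose state code is not in the exclusion list.
toy_cont_us <- subset(toy_accidents, !(STATE %in% excluded_states))
print(toy_cont_us$STATE)
```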

We also need data on state and county borders so that the points have some geographic context. Fortunately, the map_data function in the ggplot2 package makes this step very easy (it relies on the maps package being installed):

library(ggplot2)
county_map_data <- map_data("county")
state_map <- map_data("state")
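It can be useful to look at what map_data returns before plotting it. The sketch below simply inspects the state borders; it assumes the ggplot2 and maps packages are installed.

```r
library(ggplot2)  # map_data() also needs the maps package to be installed

state_map <- map_data("state")

# Each row is one vertex of a border polygon: `long` and `lat` give its
# position, `group` identifies the polygon it belongs to, and `order`
# gives the drawing sequence within that polygon.
print(head(state_map))
print(names(state_map))
```

The group column is why the plotting code maps group=group: without it, vertices from different polygons would be joined into one shape.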

Step 2: Plot the Data

Plotting the data with ggplot2 is not particularly complicated either. The key is to build the map in layers. We’ll first add a polygon layer to a blank ggplot object to draw the county borders in light grey, and then add a second polygon layer for the state borders. Finally, we’ll add points to show exactly where in the (lower 48) United States traffic fatalities occurred in 2015, plotting these in red but with a high level of transparency (alpha=0.05) to help prevent points from obscuring one another.

map <- ggplot() +
  #Add county borders:
  geom_polygon(data = county_map_data, aes(x = long, y = lat, group = group), colour = alpha("grey", 1/4), size = 0.2, fill = NA) +
  #Add state borders:
  geom_polygon(data = state_map, aes(x = long, y = lat, group = group), colour = "grey", fill = NA) +
  #Add points (one per row of the accident table, i.e. one per fatal crash):
  geom_point(data = cont_us_accidents, aes(x = LONGITUD, y = LATITUDE), alpha = 0.05, size = 0.5, col = "red") +
  #Adjust the map projection (coord_map requires the mapproj package):
  coord_map("albers", lat0 = 39, lat1 = 45) +
  #Add a title:
  ggtitle("Traffic Fatalities in 2015") +
  #Adjust the theme:
  theme_classic() +
  theme(panel.border = element_blank(),
        axis.text = element_blank(),
        line = element_blank(),
        axis.title = element_blank(),
        #"Avenir Next" ships with macOS; substitute another font family elsewhere:
        plot.title = element_text(size = 40, face = "bold", family = "Avenir Next"))
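The code above only builds the map object; typing map at the console renders it, and ggsave can write it to disk. Here is a minimal sketch of the saving step, using a tiny invented plot so that it runs on its own. The toy_points data, the toy_map object, and the output file name are all hypothetical.

```r
library(ggplot2)

# Three invented points standing in for the real crash locations.
toy_points <- data.frame(
  LONGITUD = c(-118.24, -87.63, -74.01),
  LATITUDE = c(34.05, 41.88, 40.71)
)

toy_map <- ggplot() +
  geom_point(data = toy_points, aes(x = LONGITUD, y = LATITUDE),
             alpha = 0.5, colour = "red")

# Write the plot to a temporary PNG; width and height are in inches.
out_file <- file.path(tempdir(), "toy_map.png")
ggsave(out_file, plot = toy_map, width = 8, height = 5, dpi = 150)
print(file.exists(out_file))
```

For the real map, passing plot = map and a file name of your choice should work the same way.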

Step 3: View the Finished Product

With this relatively simple code, we produce a map that clearly displays the location of 2015’s traffic fatalities:

Hopefully this post puts you well on the way to making maps of your own, and you can start exploring this data set and others like it. If you have any questions, please reach out on Twitter. I’m available @lucaspuente.
