Mapping San Francisco crime

December 30, 2014

(This article was first published on SHARP SIGHT LABS » r-bloggers, and kindly contributed to R-bloggers)

[Figure: San Francisco crime map, 2014, made with ggplot2]

When I was working as a data scientist at Apple in Silicon Valley, I’d drive up to San Francisco on nights and weekends to meet a girl for dinner or go to a meetup.

I sort of fell in love with the city, and have recently been checking out datasets on DataSF so I could do some geospatial visualizations of San Francisco.

As it turns out, much like Chicago and Philadelphia, San Francisco makes its crime data readily available. So I downloaded the crime data and started visualizing it in R using ggplot2.

The map above is the result. It’s a map of 2014 SF crime data through mid-December.

What’s remarkable is that the plotting code (the code that creates the map itself) is only 12 lines of code. And of those 12, the vast majority of the code is just formatting and subtle tweaks to aesthetic features to give it the look I wanted. (The look-and-feel of it was originally inspired by a map of London bike and pollution data at spatial.ly.)

library(ggplot2)

#################################
# GET CRIME DATA AND SF GEO DATA
#################################


#------------------------------------------
# Download the zipped SF crime data (2014)
#  and save it to the working directory
#------------------------------------------
download.file("http://www.sharpsightlabs.com/wp-content/uploads/2014/12/sf_crime_YTD-2014-12_REDUCED.txt.zip", destfile="sf_crime_YTD-2014-12_REDUCED.txt.zip")

#------------------------------
# Unzip the SF crime data file
#------------------------------
unzip("sf_crime_YTD-2014-12_REDUCED.txt.zip")

#------------------------------------
# Read crime data into an R dataframe
#------------------------------------
df.sf_crime <- read.csv("sf_crime_YTD-2014-12_REDUCED.txt")

#------------------------------
# Download water boundaries
#  and neighborhood boundaries
#------------------------------
df.sf_neighborhoods <- read.csv(url("http://www.sharpsightlabs.com/wp-content/uploads/2014/12/sf_neighborhood_boundaries.txt"))
df.sf_water <- read.csv(url("http://www.sharpsightlabs.com/wp-content/uploads/2014/12/sf_water_boundaries.txt"))



################
# PLOT THE DATA
################
ggplot() +
  geom_polygon(data=df.sf_neighborhoods,aes(x=long,y=lat,group=group) ,fill="#404040",colour= "#5A5A5A", lwd=0.05) +
  geom_polygon(data=df.sf_water, aes(x=long, y=lat, group=group),colour= "#708090", fill="#708090") +
  geom_point(data=df.sf_crime, aes(x=X, y=Y), color="#FFFF3309", fill="#FFFF3309", size=1.3) +
  geom_polygon(data=df.sf_neighborhoods, aes(x=long,y=lat, group=group) ,fill=NA,colour= "#DDDDDD55", lwd=.3) +
  ggtitle("San Francisco Crime (2014)") +
  theme(panel.background = element_rect(fill="#708090")) +
  theme(axis.title = element_blank()) +
  theme(axis.text = element_blank()) +
  theme(axis.ticks = element_blank()) +
  theme(panel.grid = element_blank()) +
  theme(plot.title = element_text(family="Trebuchet MS", size=38, face="bold", hjust=0, color="#777777"))





 
Said differently, creating maps in R using ggplot2 is not that difficult. You just need to understand how ggplot2 works.

As I’ve said before, ggplot2 has a deep syntactical structure. Once you know that structure, seemingly complex visualizations become much, much easier to create. In fact, owing to the deep structure of how ggplot2 works, this map is basically just a sophisticated scatterplot.
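To make the "sophisticated scatterplot" point concrete, here is a minimal sketch using made-up data. The toy square and the random points are assumptions for illustration, not the SF data. One geom_polygon() layer draws a boundary, and one geom_point() layer puts semi-transparent points on top, just as an ordinary scatterplot would.

```r
library(ggplot2)

# Hypothetical toy boundary: a unit square in long/lat/group form,
# mimicking the column layout of the neighborhood data used above.
df.toy_boundary <- data.frame(
  long  = c(0, 1, 1, 0),
  lat   = c(0, 0, 1, 1),
  group = 1
)

# Hypothetical "incidents": random points inside the square.
set.seed(42)
df.toy_points <- data.frame(X = runif(500), Y = runif(500))

# Layer 1: the boundary polygon.
# Layer 2: a plain scatterplot of the points. alpha = 0.3 is a stand-in
# for the 8-digit hex colour in the real map, whose last two digits
# ("09") encode a very low opacity.
p.toy <- ggplot() +
  geom_polygon(data = df.toy_boundary,
               aes(x = long, y = lat, group = group),
               fill = "#404040", colour = "#5A5A5A") +
  geom_point(data = df.toy_points, aes(x = X, y = Y),
             colour = "#FFFF33", alpha = 0.3, size = 1.3)

print(p.toy)
```

Because the points are nearly transparent, areas where many of them overlap appear brighter, which is what makes the dense parts of the crime map stand out.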

To be clear, there is a lot of data-manipulation and prep-work that I didn’t show here.

You also need to be able to build a plot like this iteratively. That is, you need to have a solid understanding of the design process for creating a visualization like this.
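One way to work iteratively is sketched here with a hypothetical stand-in data set rather than the crime data: keep the plot in a variable, change one thing at a time, and print after each step to check the result. The steps and their order are an assumption for illustration, not a record of how the map above was actually built.

```r
library(ggplot2)

# Hypothetical stand-in data (not the SF crime data).
set.seed(1)
df.demo <- data.frame(X = rnorm(300), Y = rnorm(300))

# Step 1: start with the bare points and default styling.
p <- ggplot(df.demo, aes(x = X, y = Y)) +
  geom_point()

# Step 2: rebuild the point layer with tuned colour, transparency and size.
p <- ggplot(df.demo, aes(x = X, y = Y)) +
  geom_point(colour = "#FFFF33", alpha = 0.3, size = 1.3)

# Step 3: add the dark background, then strip axes and grid lines.
p <- p + theme(panel.background = element_rect(fill = "#708090"))
p <- p + theme(axis.title = element_blank(),
               axis.text  = element_blank(),
               axis.ticks = element_blank(),
               panel.grid = element_blank())

# Step 4: add the title last.
p <- p + ggtitle("Demo: built one step at a time")

print(p)
```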

But at its core, this visualization isn’t as difficult to create as it might seem.

I’ll write up some in-depth material showing you how to make a visualization like this, step-by-step. If you’re interested in learning how to produce something like this, sign up for the email list and I’ll let you know when the in-depth tutorial is available.

The post Mapping San Francisco crime appeared first on SHARP SIGHT LABS.
