Motor Vehicle Collision Density in NYC

March 6, 2015
By

(This article was first published on Stable Markets » R, and kindly contributed to R-bloggers)

DensityByBorough

In a previous post, I visualized crime density in Boston using R’s ggmap package. In this post, I use ggmap to visualize vehicle accidents in New York City. R code and data are included in this post.

The data comes from NYC Open Data. My data cut ranges from 2012 to 2015. The data tracks the type of vehicle, the name of the street on which the accident occurs, as well as the longitude and latitude coordinates of the accident. Both coordinates are saved as a single character variable called “LOCATION”.

Below, I load the ggmap and gridExtra packages, load the data, drop all accidents without location coordinates, and parse the LOCATION variable to get the longitude and latitude coordinates. I also parse the date variable to create a year variable and use that variable to create two data sets: one with all vehicle accidents in 2013 and another with all vehicle accidents in 2014

 
library(ggmap)
library(gridExtra)

d=read.csv('.../NYPD_Motor_Vehicle_Collisions.csv')

d_clean=d[which(regexpr(',',d$LOCATION)!=-1),]

comm=regexpr(',',d_clean$LOCATION)
d_clean$loc=as.character(d_clean$LOCATION)
d_clean$lat=as.numeric(substr(d_clean$loc,2,comm-1))
d_clean$long=as.numeric(substr(d_clean$loc,comm+1,nchar(d_clean$loc)-1))
d_clean$year=substr(d_clean$DATE,7,10)

d_2013=d_clean[which(d_clean$year=='2013'),c('long','lat')]
d_2014=d_clean[which(d_clean$year=='2014'),c('long','lat')]

Next, I use get_map() to query Google Maps and get a map of NYC. I add a two-dimensional density layer to this map using stat_density2d(). I do this for both 2013 and 2014 data and use gridExtra’s grid.arrange() to place the maps side by side.

ny_plot=ggmap(get_map('New York, New York',zoom=12, maptype='terrain'))

plot1=ny_plot+
stat_density2d(data= d_2013, aes(x = d_2013$long, y = d_2013$lat, alpha=.75,fill=..level..),bins = 10, geom = 'polygon')+
guides(fill = guide_colorbar(barwidth = 1, barheight = 12)) +
scale_alpha(guide = FALSE)+ 
xlab(' ')+ylab(' ')+
ggtitle('Vehicle Accidents - NYC - 2013')

plot2=ny_plot+
stat_density2d(data= d_2014, aes(x = d_2014$long, y = d_2014$lat, alpha=.75,fill=..level..),bins = 10, geom = 'polygon')+
guides(fill = guide_colorbar(barwidth = 1, barheight = 12)) +
scale_alpha(guide = FALSE)+ 
xlab(' ')+ylab(' ')+
ggtitle('Vehicle Accidents - NYC - 2014')

grid.arrange(plot1, plot2,nrow=1,ncol=2) 

AnnualVehicleAccidentsNYC

Next, I plot 2013 accident densities by borough. I write a function boro() that drops all observations with a missing street name and subsets the 2013 data based on borough. I stack them together while adding borough names, then assign each borough a color in vector col_vals. I use stat_density2d’s group parameter to plot each borough’s density layers separately with different colors.

boro=function(x){
 d_clean2=d_clean[which(d_clean$ON.STREET.NAME!='' & d_clean$BOROUGH==x),]
 d_2013_2=d_clean2[which(d_clean2$year=='2013'),c('long','lat','ON.STREET.NAME')]
return(d_2013_2)
}

man=boro('MANHATTAN')
bronx=boro('BRONX')
brook=boro('BROOKLYN')
si=boro('STATEN ISLAND')
q=boro('QUEENS')

full=rbind(cbind(man,group='Manhattan'),
            cbind(bronx,group='Bronx'),
            cbind(brook,group='Brooklyn'),
            cbind(si,group='Staten Island'),
            cbind(q,group='Queens'))
Borough=full$group

col_vals=c('Manhattan'='dodgerblue', 'Bronx'='darkred',
            'Brooklyn'='violet','Staten Island'='darkgreen',
            'Queens'='darkgoldenrod4')

ny_plot=ggmap(get_map('New York, New York',zoom=11, maptype='terrain'))

plot4=ny_plot+
  stat_density2d(data=full, geom='polygon',bins = 10, aes(x=full$long,y=full$lat,fill = Borough, alpha=..level..))+
  scale_fill_manual(values=col_vals)+
  #guides(fill = guide_colorbar(barwidth = 1, barheight = 12)) +
  scale_alpha(guide = FALSE)+ 
  xlab(' ')+ylab(' ')+
  ggtitle('NYC Vehicle Accident Density by Borough, 2013')
plot4

DensityByBorough

To leave a comment for the author, please follow the link and comment on their blog: Stable Markets » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)