NYC Motor Vehicle Collisions – Street-Level Heat Map

[This article was first published on Stable Markets » R, and kindly contributed to R-bloggers.]

[Figure: StreetLevelMap]

In this post I extend a previous analysis that created a borough-level heat map of NYC motor vehicle collisions, going from borough-level to street-level collisions. The data is from NYC Open Data. The code is very similar to the previous analysis, with a few more functions that map streets to colors. Below, I load the ggmap package and the data, and keep only collisions that have longitude and latitude information.

library(ggmap)

d=read.csv('.../NYPD_Motor_Vehicle_Collisions.csv')
d_clean=d[which(regexpr(',',d$LOCATION)!=-1),]

#### 1. Clean Data ####
# get long and lat coordinates from concatenated "location" var
comm=regexpr(',',d_clean$LOCATION)
d_clean$loc=as.character(d_clean$LOCATION)
d_clean$lat=as.numeric(substr(d_clean$loc,2,comm-1))
d_clean$long=as.numeric(substr(d_clean$loc,comm+1,nchar(d_clean$loc)-1))

# create year variable
d_clean$year=substr(d_clean$DATE,7,10)
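
To make the coordinate parsing concrete, here is a small sketch that applies the same substr() logic to a few made-up values. The sample strings are hypothetical and assume the LOCATION column uses a "(lat, long)" format, with an empty string when coordinates are missing.

```r
# Hypothetical sample values; the "(lat, long)" format is an assumption
loc <- c("(40.7128, -74.0060)", "(40.6782, -73.9442)", "")

# Keep only entries that contain a comma, i.e. that have coordinates
loc <- loc[regexpr(",", loc) != -1]

# Position of the comma in each string
comm <- regexpr(",", loc)

# Latitude sits between the opening parenthesis and the comma;
# longitude sits between the comma and the closing parenthesis
lat  <- as.numeric(substr(loc, 2, comm - 1))
long <- as.numeric(substr(loc, comm + 1, nchar(loc) - 1))

coords <- data.frame(lat = lat, long = long)
```

as.numeric() ignores the leading space left in front of the longitude, so no extra trimming is needed.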

I use the three functions below to process my data. The boro() function subsets to collisions in a specified borough that have street names, since some collisions with coordinate data do not have street name data, and then subsets to collisions in 2013. The accident_freq() function calculates the frequency of collisions per street, then merges these numbers back to the collision-level data. This is important since the map needs collision-level data, for reasons that will be clear soon. The assign_col() function takes a collision-level data set (created with the accident_freq() function) for a particular borough and builds a palette of shades ranging from white to a specified color (e.g. green, red, etc.), with one shade per possible frequency value. Each street is later colored by indexing into this palette, so streets with more collisions will be darker.

# boro() subsets to 2013 accidents in specified borough
boro=function(x){
 d_clean2=d_clean[which(d_clean$ON.STREET.NAME!='' & d_clean$BOROUGH==x),]
 d_2013_2=d_clean2[which(d_clean2$year=='2013'),c('long','lat','ON.STREET.NAME')]
 return(d_2013_2)
}

# accident_freq() gets frequency of accidents per street for specified borough
accident_freq=function(x){
 tab=data.frame(table(x$ON.STREET.NAME))
 d_merge=merge(x=x,y=tab,by.x=c('ON.STREET.NAME'),by.y=c('Var1'))
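 # share of the borough's collisions on each street, per 1000 (despite the name, not a percentage)
 d_merge$freqPerc=round((d_merge$Freq/length(x$ON.STREET.NAME))*1000,digits=0)
 # floor at 1 so freqPerc is always a valid index into the color palette
 d_merge$freqPerc=ifelse(d_merge$freqPerc==0,1,d_merge$freqPerc)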
return(d_merge)
}

# assign_col() builds a palette from white to the given color, with one shade per freqPerc value
assign_col=function(x,col){
 pal=colorRampPalette(c('white',col))
 colors=pal(max(x$freqPerc))
 return(colors)
}

man=boro('MANHATTAN')
bronx=boro('BRONX')
brook=boro('BROOKLYN')
si=boro('STATEN ISLAND')
q=boro('QUEENS')

man_freq=accident_freq(man)
bronx_freq=accident_freq(bronx)
brook_freq=accident_freq(brook)
si_freq=accident_freq(si)
q_freq=accident_freq(q)

man_col=assign_col(man_freq,'dodgerblue')
bronx_col=assign_col(bronx_freq,'darkred')
brook_col=assign_col(brook_freq,'violet')
si_col=assign_col(si_freq,'darkgreen')
q_col=assign_col(q_freq,'darkgoldenrod4')
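
As a worked illustration of how the per-street frequency ends up as a color, the sketch below runs the same steps on a tiny made-up data set. The street names and counts are hypothetical, and pmax() stands in for the ifelse() floor used above.

```r
# Hypothetical toy data: one row per collision, with made-up street names
toy <- data.frame(
  ON.STREET.NAME = c("BROADWAY", "5 AVENUE", "BROADWAY", "CANAL STREET",
                     "BROADWAY", "5 AVENUE"),
  stringsAsFactors = FALSE
)

# Collisions per street, merged back so every collision row carries its street's count
tab <- data.frame(table(toy$ON.STREET.NAME))
merged <- merge(x = toy, y = tab, by.x = "ON.STREET.NAME", by.y = "Var1")

# Share of all collisions on each street, per 1000, floored at 1 so it is a valid index
merged$freqPerc <- pmax(round(merged$Freq / nrow(toy) * 1000), 1)

# One shade per possible freqPerc value, from white to the borough color
pal <- colorRampPalette(c("white", "dodgerblue"))
shades <- pal(max(merged$freqPerc))

# Look up each collision's shade by using freqPerc as an index into the palette
merged$shade <- shades[merged$freqPerc]
```

Note that merge() sorts its result by the key column by default, so the rows of merged are ordered by street name rather than in the order of toy. Any color vector built from the merged data lines up only with the rows of the merged data.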

Finally, I use ggmap’s get_map() function to get a toner-style map of NYC and add geom_path() layers, one per borough. geom_path() connects all longitude and latitude points that are on the same street with a line or “path.” Essentially, it uses street as a grouping factor for the coordinates: all coordinates in a group are connected. Each line is then given a color determined by assign_col() using the col= parameter.

ny_plot=ggmap(get_map('New York, New York',zoom=11,maptype='toner'))

# Plot from the *_freq data frames: merge() in accident_freq() reorders rows by street name,
# so the per-row colors only line up with the rows of the merged data.
plot3=ny_plot+
 geom_path(data=man_freq,size=1,aes(x=long,y=lat,group=ON.STREET.NAME),col=man_col[man_freq$freqPerc])+
 geom_path(data=bronx_freq,size=1,aes(x=long,y=lat,group=ON.STREET.NAME),col=bronx_col[bronx_freq$freqPerc])+
 geom_path(data=brook_freq,size=1,aes(x=long,y=lat,group=ON.STREET.NAME),col=brook_col[brook_freq$freqPerc])+
 geom_path(data=si_freq,size=1,aes(x=long,y=lat,group=ON.STREET.NAME),col=si_col[si_freq$freqPerc])+
 geom_path(data=q_freq,size=1,aes(x=long,y=lat,group=ON.STREET.NAME),col=q_col[q_freq$freqPerc])+
 ggtitle('Street-Level NYC Vehicle Accidents by Borough')+
 xlab(" ")+ylab(" ")
plot3
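
To see the grouping behavior of geom_path() in isolation, here is a minimal sketch that uses ggplot2 alone, without a basemap. The street names, coordinates, and shades are made up for illustration.

```r
library(ggplot2)

# Hypothetical coordinates for two made-up streets, three collisions each
toy <- data.frame(
  street = rep(c("A STREET", "B AVENUE"), each = 3),
  long   = c(-74.00, -73.99, -73.98, -74.00, -74.00, -74.00),
  lat    = c(40.70, 40.71, 40.72, 40.73, 40.74, 40.75),
  shade  = rep(c("lightblue", "dodgerblue"), each = 3),
  stringsAsFactors = FALSE
)

# group = street draws one path per street, connecting its points in row order;
# colour is supplied outside aes() as one value per row of the data
p <- ggplot(toy, aes(x = long, y = lat, group = street)) +
  geom_path(colour = toy$shade)
```

Printing p draws two separate lines, one per street. Because the colour vector is matched to the data row by row, it has to come from the same data frame, in the same row order, as the one passed to the layer.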
