Crime Analysis – Denver-Part 2
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Getting More Granular
Where we’re going
In having noticed the crime rates heading up over the last few years, taking a better look seemed more important. I want to first look at “CRIME” before looking into “TRAFFIC” in the data set. It sounds more interesting and I hope the results don’t keep me up at night.
What we’ll do in this post
- Load the csv, format the data
- This will all be hidden and can be found in the previous post (Part 1)
- Look into apparent growth in crime rates from 2012 – 2014
- We’ll focus only on those that fit the “ISCRIME” definition and not “ISTRAFFIC”
Let’s dive in!
Exploration of Data
Data provided by http://data.denvergov.org
Data Import & Formatting – shown in prior post: Crime Analysis – Denver-Part 2
Looking at Crime Incidents by Year
df = data %>% filter(IS_CRIME==1) %>% filter(year!=max(year(date))) %>% group_by(year) %>% summarise(incidents=sum(IS_CRIME)) %>% arrange(year) p = ggplot(df,aes(x=year,y=incidents,label=incidents)) p + geom_bar(stat='identity') + geom_text(fontface='bold',size=6,col='white',vjust=1)+ ggtitle('Crime Volume by Year') + xlab('Year') + ylab('Incidents') + theme(plot.title = element_text(hjust = 0.5))
df = data %>% filter(IS_CRIME==1) %>% filter(year!=max(year(date))) %>% group_by(year) %>% summarise(incidents=sum(IS_CRIME)) %>% arrange(year) %>% mutate(year,YoYPercentageChange=round(100*(incidents-lag(incidents))/lag(incidents)),0) df = df[!is.na(df$YoYPercentageChange),] p = ggplot(df,aes(x=year,y=YoYPercentageChange,label=YoYPercentageChange)) p + geom_bar(stat='identity') + geom_text(fontface='bold',size=6,col='white',vjust=1)+ ggtitle('Crime Percentage Change Year-Over-Year') + xlab('Year') + ylab('YoY Incident % Change') + theme(plot.title = element_text(hjust = 0.5))
Observations* * Crime rose the most between 2013 and 2012 (39% increase) * Crime increased each year after but at a decreasing rate * Examine years 2012 – 2014 to see growth changes
Highest volume of “ISCRIME” types
Identify the offense by OFFENSECATEGORY_ID and exclude months we have not seen so far this year.
#Isolate Years 2012 - 2014 data = data[data$year <= 2014 & data$year >= 2012,] #Sum up all incidents IS_CRIME AND IS_TRAFFIC maxYear = max(data$year) maxMonthYTD = max(data$month[data$year==maxYear]) #Look into IS_CRIME only df = data %>% filter(IS_CRIME==1) %>% group_by(year,OFFENSE_CATEGORY_ID) %>% summarise(incidents=sum(IS_CRIME)) %>% arrange(desc(incidents)) p = ggplot(df,aes(x=factor(year),y=incidents,fill=year)) p + geom_bar(stat='identity') + ggtitle('Crime Incidents Reported by Year') + xlab('Year') + ylab('Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3)
Observations
It would appear as if “all-other-crimes” has moved the needle the most between 2012 – 2014. This is not a very specific category. It’s also worth noticing that “other-crimes-against-persons” has grown as well. Both of these leads to some speculation that perhaps these vague types of crimes started being reported during this period and perhaps hadn’t been documented before.
Growth categories:
- “larceny”
- “drug-alcohol”
- “public-disorder”
Declining categories:
- “theft-from-motor-vehicle”
- “robbery”
- “burglary”
Many of the other categories have a much lower volume of incidents. Growth is more difficult to see in visualizatoins for these cases.
Here’s a look at growth year-over-year:
df2 = df %>% group_by(OFFENSE_CATEGORY_ID) %>% arrange(OFFENSE_CATEGORY_ID,year) %>% mutate(year,YoYchange=round((incidents-lag(incidents))),0) %>% filter(year != 2012) p = ggplot(df2,aes(x=factor(year),y=YoYchange,fill=year,label=YoYchange)) p + geom_bar(stat='identity') + ggtitle('Change in Crime Incidents vs Previous Year') + xlab('Year') + ylab('YoY Change in Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3) + geom_text(hjust=0.5, size=5,col='red', fontface='bold')
Here’s a look at % growth year-over-year:
df2 = df %>% group_by(OFFENSE_CATEGORY_ID) %>% arrange(OFFENSE_CATEGORY_ID,year) %>% mutate(year,YoYchange=round(100*((incidents-lag(incidents))/lag(incidents))),0) %>% filter(year != 2012) p = ggplot(df2,aes(x=factor(year),y=YoYchange,fill=year,label=YoYchange)) p + geom_bar(stat='identity') + ggtitle('% Change in Crime Incidents vs Previous Year') + xlab('Year') + ylab('YoY % Change in Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3) + geom_text(hjust=1,col='red',size=5,fontface='bold')
Observations
- “all-other-crimes” is the outright leader in change in both volume and percentage growth year-over-year with an astonishing 380% increase between 2012 and 2013
- “drug-alcohol” grew by 173% between 2012 and 2013 but dropped down to only 27% growth the next year
- “murder” didn’t change too much in volume compared to everything else (swinging up 7 and down 10) but was a 19% growth and a 23% decline in 2013 and 2014 respectively
Final Thoughts (for now)
Due to the vague nature of the types of crimes which grew the most, I can’t determine exactly what happened in Denver during 2013. In the less vague crimes, “drug-alcohol” saw the largest increase. This was followed by “public-disorder” and perhaps there’s a relationship there. My assumption is that one may perhaps cause the other…
I’m still curious about the seasonality and month-to-month effects. Perhaps certain types of crimes are more common during certain times. I’m also very interested to see if a new population was perhaps added to the mix in 2013. If a certain part of Denver was added in 2013 that would certainly help to explain the situation.
What I’ll do in the next crime posts
- Look for patterns by location
- Lay out some visualizations on maps
- Try to identify areas with high volumes of traffic incidents (maybe I can avoid a ticket)
- Answer the question: What types of crimes have grown the most in the last 5 years?
Code used in this post is on my GitHub
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.