Crime Analysis – Denver-Part 2

November 17, 2016
By

(This article was first published on R-Projects - Stoltzmaniac, and kindly contributed to R-bloggers)

Getting More Granular

Where we’re going
In having noticed the crime rates heading up over the last few years, taking a better look seemed more important. I want to first look at “CRIME” before looking into “TRAFFIC” in the data set. It sounds more interesting and I hope the results don’t keep me up at night.

What we’ll do in this post

  • Load the csv, format the data
    • This will all be hidden and can be found in the previous post (Part 1)
  • Look into apparent growth in crime rates from 2012 – 2014
  • We’ll focus only on those that fit the “ISCRIME” definition and not “ISTRAFFIC”

Let’s dive in!

Exploration of Data
Data provided by http://data.denvergov.org

Data Import & Formatting – shown in prior post: Crime Analysis – Denver-Part 2

Looking at Crime Incidents by Year

df = data %>%  
  filter(IS_CRIME==1) %>%
  filter(year!=max(year(date))) %>%
  group_by(year) %>%
  summarise(incidents=sum(IS_CRIME)) %>%
  arrange(year)

p = ggplot(df,aes(x=year,y=incidents,label=incidents))  
p + geom_bar(stat='identity') + geom_text(fontface='bold',size=6,col='white',vjust=1)+ ggtitle('Crime Volume by Year') + xlab('Year') + ylab('Incidents') + theme(plot.title = element_text(hjust = 0.5))  

barplotCrime

df = data %>%  
  filter(IS_CRIME==1) %>%
  filter(year!=max(year(date))) %>%
  group_by(year) %>%
  summarise(incidents=sum(IS_CRIME)) %>%
  arrange(year) %>%
  mutate(year,YoYPercentageChange=round(100*(incidents-lag(incidents))/lag(incidents)),0)
df = df[!is.na(df$YoYPercentageChange),]  
p = ggplot(df,aes(x=year,y=YoYPercentageChange,label=YoYPercentageChange))  
p + geom_bar(stat='identity') + geom_text(fontface='bold',size=6,col='white',vjust=1)+ ggtitle('Crime Percentage Change Year-Over-Year') + xlab('Year') + ylab('YoY Incident % Change') + theme(plot.title = element_text(hjust = 0.5))  

barplotCrimeChange

Observations*
* Crime rose the most between 2013 and 2012 (39% increase)
* Crime increased each year after but at a decreasing rate
* Examine years 2012 – 2014 to see growth changes

Highest volume of “ISCRIME” types
Identify the offense by OFFENSECATEGORY_ID and exclude months we have not seen so far this year.

#Isolate Years 2012 - 2014
data = data[data$year <= 2014 & data$year >= 2012,]  

#Sum up all incidents IS_CRIME AND IS_TRAFFIC
maxYear = max(data$year)  
maxMonthYTD = max(data$month[data$year==maxYear])

#Look into IS_CRIME only
df = data %>%  
  filter(IS_CRIME==1) %>%
  group_by(year,OFFENSE_CATEGORY_ID) %>%
  summarise(incidents=sum(IS_CRIME)) %>%
  arrange(desc(incidents))

p = ggplot(df,aes(x=factor(year),y=incidents,fill=year))  
p + geom_bar(stat='identity') + ggtitle('Crime Incidents Reported by Year') + xlab('Year') + ylab('Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3)

barplotCrimeCategories

Observations
It would appear as if “all-other-crimes” has moved the needle the most between 2012 – 2014. This is not a very specific category. It’s also worth noticing that “other-crimes-against-persons” has grown as well. Both of these leads to some speculation that perhaps these vague types of crimes started being reported during this period and perhaps hadn’t been documented before.

  • Growth categories:

    • “larceny”
    • “drug-alcohol”
    • “public-disorder”
  • Declining categories:

    • “theft-from-motor-vehicle”
    • “robbery”
    • “burglary”

Many of the other categories have a much lower volume of incidents. Growth is more difficult to see in visualizatoins for these cases.

Here’s a look at growth year-over-year:

df2 = df %>%  
  group_by(OFFENSE_CATEGORY_ID) %>%
  arrange(OFFENSE_CATEGORY_ID,year) %>%
  mutate(year,YoYchange=round((incidents-lag(incidents))),0) %>% filter(year != 2012)

p = ggplot(df2,aes(x=factor(year),y=YoYchange,fill=year,label=YoYchange))  
p + geom_bar(stat='identity') + ggtitle('Change in Crime Incidents vs Previous Year') + xlab('Year') + ylab('YoY Change in Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3) + geom_text(hjust=0.5, size=5,col='red', fontface='bold')  

barplotCrimeCategoryChange

Here’s a look at % growth year-over-year:

df2 = df %>%  
  group_by(OFFENSE_CATEGORY_ID) %>%
  arrange(OFFENSE_CATEGORY_ID,year) %>%
  mutate(year,YoYchange=round(100*((incidents-lag(incidents))/lag(incidents))),0) %>% filter(year != 2012)

p = ggplot(df2,aes(x=factor(year),y=YoYchange,fill=year,label=YoYchange))  
p + geom_bar(stat='identity') + ggtitle('% Change in Crime Incidents vs Previous Year') + xlab('Year') + ylab('YoY % Change in Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3) + geom_text(hjust=1,col='red',size=5,fontface='bold')  

barplotCrimeCategoryPercentageChange

Observations

  • “all-other-crimes” is the outright leader in change in both volume and percentage growth year-over-year with an astonishing 380% increase between 2012 and 2013
  • “drug-alcohol” grew by 173% between 2012 and 2013 but dropped down to only 27% growth the next year
  • “murder” didn’t change too much in volume compared to everything else (swinging up 7 and down 10) but was a 19% growth and a 23% decline in 2013 and 2014 respectively

Final Thoughts (for now)
Due to the vague nature of the types of crimes which grew the most, I can’t determine exactly what happened in Denver during 2013. In the less vague crimes, “drug-alcohol” saw the largest increase. This was followed by “public-disorder” and perhaps there’s a relationship there. My assumption is that one may perhaps cause the other…

I’m still curious about the seasonality and month-to-month effects. Perhaps certain types of crimes are more common during certain times. I’m also very interested to see if a new population was perhaps added to the mix in 2013. If a certain part of Denver was added in 2013 that would certainly help to explain the situation.

What I’ll do in the next crime posts

  • Look for patterns by location
  • Lay out some visualizations on maps
  • Try to identify areas with high volumes of traffic incidents (maybe I can avoid a ticket)
  • Answer the question: What types of crimes have grown the most in the last 5 years?

Code used in this post is on my GitHub

To leave a comment for the author, please follow the link and comment on their blog: R-Projects - Stoltzmaniac.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)