Last week I looked at time effects of the shootingtracker database. This week I will look at the states. Some (smaller) states never made it on the database. Other states, far too frequently. The worst of these California. After correcting for population size, Washington DC with 0.8 shootings per year per 100000 inhabitants sticks out. Subsequently, the observed incidences have been shrunk towards the common distribution. In these posterior estimates Massachusetts is best while Washington DC, Illinois and Louisiana are worst.
Data are from the shootingtracker database and have been updated July 25. The last entry in the database is July 25. The script is a slightly adapted version from last week. Most notably, I did not notice last week the database uses KS for Kansas, but KA is what my other data uses. Note that in order make this post readable, all code is at a separate section.
Adding state information
Assuming a beta prior and binomial distribution for number of shootings, Jags is used to estimate posteriors. The one noticeable thing in the model is the prior for b. b really wants to be high, hence the prior is quite a bit wider than usual. The current values seem to be acceptable, see also the output at the bottom of this post.
r14 <- readLines('raw14.txt')
r1 <- c(r13,r14)
r2 <- gsub('\\[[a-zA-Z0-9]*\\]','',r1)
r3 <- gsub('^ *$','',r2)
r4 <- r3[r3!='']
r5 <- gsub('\\t$','',r4)
r6 <- gsub('\\t References$','',r5)
r7 <- read.table(textConnection(r6),
r7$Location[r7$Location==’Washington DC’] <-
r8 <- read.table(textConnection(as.character(r7$Location)),
r8$State <- gsub(' ','',r8$State)
r8$State[r8$State==’Tennessee’] <- 'TN'
r8$State[r8$State==’Ohio’] <- 'OH'
r8$State[r8$State %in% c(‘Kansas’,’KA’)] <- 'KS'
r8$State[r8$State==’Louisiana’] <- 'LA'
r8$State[r8$State==’Illinois’] <- 'IL'
r8$State <- toupper(r8$State)
r7$StateAbb <- r8$State
r7$Location <- r8$Location
r7 <- r7[! (r7$State %in% c( 'PUERTORICO','PR')),]
r7$Date <- gsub('/13$','/2013',r7$Date)
r7$date <- as.Date(r7$Date,format="%m/%d/%Y")
r10$incidence <- 100000*r10$x/r10$Population*365/
r10$pred <- apply(samples,2,mean)
r10$l05 <- apply(samples,2,function(x) quantile(x,.05))
r10$l95 <- apply(samples,2,function(x) quantile(x,.95))
r11 <- subset(r10,,c(State,l05,l95,pred,incidence))
r11 <- r11[order(r11$pred),]
r11$State <- factor(r11$State,levels=r11$State)
p <- ggplot(r11, aes(y=pred, x=State))
limits <- aes(ymin =l05, ymax=l95)
dodge <- position_dodge(width=0.9)
p +geom_point(position=dodge, stat=”identity”) +
geom_errorbar(limits, position=dodge, width=0.25) +
theme(legend.position = “bottom”) +
position=dodge, stat=”identity”,col=’red’) +