Normalized Frequency of Terrorism in the US

[This article was first published on Frank Portman, and kindly contributed to R-bloggers].

I’ve been using the Global Terrorism Database a lot lately so I decided to share an interesting plot I made with the data.

The GTD provides over 100,000 observations of terrorist incidents between 1970 and 2011. Of these, about 2,400 occurred in the USA. While this is not a large number, the graph still provides some interesting and intuitive results.

## Load libraries
library(ggplot2)
library(plyr)
library(maps)
library(stringr)

## Load terrorism data
gtd.data <- read.csv("gtd.csv", stringsAsFactors = FALSE)



##
## Begin USA heatmap plot
##

## Subset data to only include terrorist attacks in the USA
gtd.usa <- subset(gtd.data, country_txt == "United States")

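## Clean provstate column: drop the literal "(U.S. State)" suffix
## (fixed() stops the parentheses and dots being read as regex syntax),
## then remove any remaining stray parentheses
gtd.usa$provstate <- str_replace(gtd.usa$provstate, fixed("(U.S. State)"), "")
gtd.usa$provstate <- str_replace(gtd.usa$provstate, "[(]", "")
gtd.usa$provstate <- str_replace(gtd.usa$provstate, "[)]", "")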

## Trim whitespaces
gtd.usa$provstate <- str_trim(gtd.usa$provstate)

## Load US state population data
populations <- read.csv("states.csv")

## Create counts of terrorist activity in each state
counts <- count(gtd.usa, "provstate")

## Merge the populations dataset with the counts dataset
gtd.pop.merge <- merge(counts, populations, by.x = "provstate", by.y = "Name")

## Create normalized terrorism frequency by dividing frequency
## by the population of the state
gtd.pop.merge <- mutate(gtd.pop.merge, normal = freq / CENSUS2010POP)
gtd.pop.merge$normal <- log10(gtd.pop.merge$normal)

gtd.pop.merge$provstate <- tolower(gtd.pop.merge$provstate)
names(gtd.pop.merge)[1] <- "region"

## Load US state data
states <- map_data("state")

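## Merge the map data with our previous dataset
merged <- merge(states, gtd.pop.merge, sort = FALSE, by = "region")

## merge() does not guarantee row order, so restore the polygon drawing order
merged <- merged[order(merged$order), ]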

## Plot the heatmap
g <- ggplot(merged) + geom_polygon(aes(x = long, y = lat, group = group,
                                       fill = normal))

g <- g + scale_fill_gradient(low = "lightgreen", high = "blue")

g <- g + theme_bw() + labs(fill = "Normalized Frequency of Terrorism") +
     theme(legend.position = "bottom")

g <- g + xlab(NULL) + ylab(NULL)

g <- g + theme(panel.grid.minor=element_blank(),
               panel.grid.major=element_blank())

g <- g + theme(axis.text.x = element_blank(), axis.text.y = element_blank())

g <- g + ggtitle("Normalized Frequency of Terrorism in the USA")

g <- g + scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL)

g

To obtain meaningful results, I did not simply plot the number of terrorist incidents per state; I divided each state’s count by its 2010 population. I know this is not entirely correct, since state populations have shifted relative to one another between 1970 and 2011, but it was fine for my purposes. The resulting frequencies clustered together, so I took a log10 transform to spread the values out more smoothly.
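
As a rough illustration of that normalization, here is a minimal base-R sketch on made-up numbers. The `toy` data frame, its region names, and its counts and populations are all hypothetical and are not taken from the GTD or the census file; only the two steps (divide by population, then take log10) mirror what the script does.

```r
## Hypothetical counts and populations, for illustration only
toy <- data.frame(
  region = c("state_a", "state_b", "state_c"),
  freq   = c(500, 50, 5),       # raw incident counts
  pop    = c(5e7, 5e5, 5e3),    # state populations
  stringsAsFactors = FALSE
)

## Per-capita rate: incidents divided by population
toy$rate <- toy$freq / toy$pop

## log10 transform, so rates that differ by orders of magnitude
## land on an evenly spaced scale
toy$normal <- log10(toy$rate)

## Ranking by raw count and by per-capita rate can disagree
toy$rank_by_count <- rank(-toy$freq)
toy$rank_by_rate  <- rank(-toy$rate)

print(toy)
```

In this toy case the region with the most incidents has the lowest per-capita rate, which is the kind of reversal a raw-count map would hide. The per-capita rates span two orders of magnitude, and the log10 step turns them into evenly spaced fill values.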
