Normalized Frequency of Terrorism in the US

[This article was first published on Frank Portman, and kindly contributed to R-bloggers.]

I’ve been using the Global Terrorism Database (GTD) a lot lately, so I decided to share an interesting plot I made with the data.

The GTD provides over 100,000 observations of terrorist incidents between 1970 and 2011, of which about 2,400 occurred in the USA. While this is not a large number, the resulting map still shows some interesting and intuitive patterns.

```r
## Load libraries
library(ggplot2)
library(plyr)
library(maps)
library(stringr)

## Load terrorism data
gtd.data <- read.csv("gtd.csv", stringsAsFactors = FALSE)

##
## Begin USA heatmap plot
##

## Subset data to only include terrorist attacks in the USA
gtd.usa <- subset(gtd.data, country_txt == "United States")

## Clean provstate column: drop the literal "(U.S. State)" suffix
## (fixed() avoids treating the parentheses and dots as regex syntax),
## then remove any leftover parentheses
gtd.usa$provstate <- str_replace(gtd.usa$provstate, fixed("(U.S. State)"), "")
gtd.usa$provstate <- str_replace(gtd.usa$provstate, "[(]", "")
gtd.usa$provstate <- str_replace(gtd.usa$provstate, "[)]", "")

## Trim whitespace
gtd.usa$provstate <- str_trim(gtd.usa$provstate)

## Load US state population data
populations <- read.csv("states.csv")

## Create counts of terrorist activity in each state (plyr::count)
counts <- count(gtd.usa, "provstate")

## Merge the populations dataset with the counts dataset
gtd.pop.merge <- merge(counts, populations, by.x = "provstate", by.y = "Name")

## Create normalized terrorism frequency by dividing frequency
## by the population of the state, then take log10
gtd.pop.merge <- mutate(gtd.pop.merge, normal = freq / CENSUS2010POP)
gtd.pop.merge$normal <- log10(gtd.pop.merge$normal)

## map_data() uses lowercase state names in a column called "region"
gtd.pop.merge$provstate <- tolower(gtd.pop.merge$provstate)
names(gtd.pop.merge)[1] <- "region"

## Load US state data
states <- map_data("state")

## Merge the map data with our previous dataset
merged <- merge(states, gtd.pop.merge, sort = FALSE, by = "region")

## merge() does not guarantee row order, so restore the order in which
## the polygon vertices must be drawn
merged <- merged[order(merged$order), ]

## Plot the heatmap
g <- ggplot(merged) + geom_polygon(aes(x = long, y = lat, group = group,
                                       fill = normal))

g <- g + scale_fill_gradient(low = "lightgreen", high = "blue")

g <- g + theme_bw() + labs(fill = "Normalized Frequency of Terrorism") +
     theme(legend.position = "bottom")

g <- g + xlab(NULL) + ylab(NULL)

g <- g + theme(panel.grid.minor = element_blank(),
               panel.grid.major = element_blank())

g <- g + theme(axis.text.x = element_blank(), axis.text.y = element_blank())

g <- g + ggtitle("Normalized Frequency of Terrorism in the USA")

g <- g + scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL)

g
```
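One thing worth checking in a script like this: `merge()` performs an inner join by default, so any state whose name fails to match between the two tables silently drops out. The sketch below shows one way to list the mismatches. It uses small made-up data frames with the same column names as above (`provstate`, `freq`, `Name`, `CENSUS2010POP`); the values are hypothetical and are not taken from the GTD or the census file.

```r
## Toy stand-ins for the `counts` and `populations` data frames.
## All values here are hypothetical, for illustration only.
counts <- data.frame(provstate = c("California", "New York", "Puerto Rico"),
                     freq = c(10, 8, 3),
                     stringsAsFactors = FALSE)
populations <- data.frame(Name = c("California", "New York", "Texas"),
                          CENSUS2010POP = c(1000, 800, 900),
                          stringsAsFactors = FALSE)

## merge() is an inner join by default: rows without a partner disappear
inner <- merge(counts, populations, by.x = "provstate", by.y = "Name")

## Names that have incidents but no matching population row
dropped <- setdiff(counts$provstate, populations$Name)

## Names that have a population row but no recorded incidents
no_incidents <- setdiff(populations$Name, counts$provstate)

print(dropped)
print(no_incidents)
```

If `dropped` is non-empty, the name cleaning probably missed a case. Entries in `no_incidents` are states that will be absent from the merged data and therefore left unfilled on the map.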

To obtain meaningful results, I did not simply plot the number of terrorist incidents per state; instead, I divided each state’s count by its 2010 population. I know this is not entirely correct, since the states’ populations have shifted relative to one another between 1970 and 2011, but it was fine for my purposes. The resulting frequencies were clustered, so I took a log10 transform to spread the values out more smoothly.
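To make the effect of the two steps concrete, here is a minimal sketch on made-up numbers. The data frame `toy` and its columns are hypothetical; only the arithmetic (count divided by population, then `log10`) mirrors the post.

```r
## Hypothetical incident counts and populations, for illustration only
toy <- data.frame(region = c("a", "b", "c", "d"),
                  freq = c(2, 20, 200, 500),
                  pop = c(1e6, 1e6, 1e6, 1e6))

## Per-capita rate: incidents divided by population
toy$rate <- toy$freq / toy$pop

## log10 turns multiplicative gaps into additive ones, so a tenfold
## difference in rate becomes a difference of exactly 1
toy$normal <- log10(toy$rate)

print(toy)
```

On the raw scale the first three regions would be nearly indistinguishable next to the fourth; after the transform they sit one unit apart. Note that `log10(0)` is `-Inf`, which the script above never encounters only because states with no recorded incidents never appear in the counts.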
