Using R to study the Yemen Conflict with night light images

[This article was first published on fReigeist » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The Yemeni civil war has received very little attention despite the growing humanitarian disaster. There is a lack of reliable figures on the extent of the human suffering in Yemen. The few data that is available suggests that it is immense. According to the UN, from March 2015 to August 2016, over 10,000 people have been killed in Yemen, including 3,799 civilians.

This note asks whether high frequency satellite images do capture the extent to which conflict is ongoing in Yemen and asks in particular, whether there is distinct geographic variation suggesting which areas are most affected by the ongoing conflict.

Can the effect and the spatial incidence of the conflict in Yemen be traced through satellite images?

Satellite images have been used to study urban sprawl and general economic growth and development. The extent to which satellite images can be used to study man-made disasters such as conflicts is not widely explored.

There are lots of other papers that have used various night light data sets to study urbanization, ethnic favoritism, and economic growth (see Henderson et al, 2012 ; Michalopoulos and Papaioannou 2013,  Hodler and Raschky, 2014).

In related work Fetzer et al., 2016, I studied the extent to which light emissions in the early 1990s can be used to obtain a measure of the extent of power rationing in Colombia following El-Nino induced droughts. In another project, we use the DMSP night light images to study the evolution of cities over time and how democratization can change the relative distribution of cities Fetzer and Shanghavi, 2015.

Since 2012, the VIIRS
high frequency and high resolution satellite images capturing night lights emissions are available from NASA’s Earth Observation Group. They have now been made available for analysis on Google’s Earth Engine, making them much more accessible to the wider research audience.

Lets have a look at night light Yemen before and after the Saudi Arabian military intervention.

yemen-after

Average VIIRS lights after the Saudi intervention in Yemen started.

yemen-before

Average VIIRS lights for the period before the Saudi intervention in Yemen.

The light scales are identical, indicating that relative to the border with Saudi Arabia, the night light emissions from Yemen have dropped dramatically, especially around the capital city Sana’a. The circular blobs indicated are around the main oil/ gas producing parts of Yemen, where there may be light emissions due to flaring of natural gas.

A minimal average light emissions of 0.5 was imposed
Zooming in to Sana’a, the figures look as follows.

sanaa-after

Average light emissions from Sana’a since the Saudi intervention in Yemen started.

 

sanaa-before

Average light emissions from Sana’a for period before the Saudi intervention in Yemen.

Having a look at the data

library(data.table)
library(foreign)
library(plyr)
library(parallel)

options(stringsAsFactors = FALSE)
setwd("/Users/thiemo/Dropbox/Research/Yemen")

# A DATA SET OF 34k populated places (or historically populated places)
YE <- data.table(read.csv(file = "~/Dropbox/Research/Yemen/Yemen-Cities.csv"))

# LIGHTS DATA IS FROM VIIRS Images made availabe on the Google Earth Engine
LIGHTS <- data.table(read.csv(file = "~/Dropbox/Research/Yemen/lightsall.csv"))

LIGHTS[, `:=`(year, as.numeric(substr(system.index, 1, 4)))]
LIGHTS[, `:=`(month, as.numeric(substr(system.index, 5, 6)))]
LIGHTS[, `:=`(.geo, NULL)]
LIGHTS[, `:=`(UFI, NULL)]
LIGHTS[, `:=`(LONGITUDE, NULL)]

LIGHTS[, `:=`(date, strptime(paste(year, month, "01", sep = "-"), "%Y-%m-%d"))]

LIGHTS <- join(LIGHTS, YE)

## Joining by: rownum

Some simple plots are quite suggestive. The following plots the average light emissions around populated places over time by month. The date of the intervention onset, which coincides with the date of the fall of Sana’a coincides with dramatic drop in light emissions.

Average lights dropped by a almost 2/3, suggesting a stand still in economic activity. Overall light emissions are still visible as indicated in the graphs suggesting that the places do not turn pitch black. The

plot(LIGHTS[, mean(list), by = date], type = "l")

before-after

The Distributional Effects of the Conflict

The Houthi movement has been gaining influence over a longer time period. In particular, since the 2012 the Houthi’s have gained influence spreading from North to the South. The European Council of Foreign Relations has produced maps illustrating the spatial expansion of Houthi control in Yemen.

Yemen_Religious

A central question relates to the strategy of the Saudi military intervention. In particular, whether the intervention is aimed at territories that came under Houthi control since 2012 or whether the intervention is targeted at the Houthi-heartland.

A simple exercise that allows this study is to look at the evolution of lights in the northern Houthi-heartland relative to the populated places in the rest of the country that came under Houthi control since 2012.

A definition of what consists of the Houthi-heartland is subject to contention. But a conservative definition may consist of the four governerates Ammran, Sada’ah, Al Jawf and Hajjah.

LIGHTS[, `:=`(HOUTHI, as.numeric(ADM1 %in% c("15", "22", "21", "19")))]
require(ggplot2)
ggplot(LIGHTS[, mean(list), by = c("HOUTHI", "date")], aes(date, V1, colour = as.factor(HOUTHI))) + 
    geom_line() + geom_point() + theme_bw() + theme(legend.position = "bottom")

 

before-after-2

The summary statistics suggest that in absolute terms much larger in the non-Houthi heartland. Though given that the initial level in the Houthi heartland is much lower, suggesting that that part of the country is much less developed. Given that there is a notional minimum light emissions of zero, this truncation of the data is a concern.

One way around this is to dummify the lights measure and look at whether a populated place is lit above a certain threshold.

LIGHTS[, `:=`(anylit, list > 0.25)]
ggplot(LIGHTS[, mean(anylit), by = c("HOUTHI", "date")], aes(date, V1, colour = as.factor(HOUTHI))) + 
    geom_line() + geom_point() + theme_bw() + theme(legend.position = "bottom")

before-after-3

Again it is hard to see whether there is any divergence in trends in this dummified measure, but this naturally is less prone to be affected by the truncation inherent to this type of data.

A regression with location and time fixed effects that measures whether there was a distinct change in nightlights in places in the Houthi-heartland relative to the non-Houthi heartland suggests that there is indeed a marked difference, indicating that the conflict is concentrated in the non-Houthi heartland.

Definint the discrete variable for a difference in difference estimation and loading the lfe package that allows for high dimensional fixed effects:

LIGHTS[, `:=`(anylit, list > 0.25)]
LIGHTS[, `:=`(postKSAintervention, as.numeric(date > "2015-03-01"))]
library(lfe)
LIGHTS[, `:=`(date, as.factor(date))]

Running the actual difference in difference regressions:

# levels
summary(felm(list ~ postKSAintervention:HOUTHI | rownum + date | 0 | ADM1, data = LIGHTS))

## 
## Call:
##    felm(formula = list ~ postKSAintervention:HOUTHI | rownum + date |      0 | ADM1, data = LIGHTS) 
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -74.347  -0.205   0.043   0.194  82.063 
## 
## Coefficients:
##                            Estimate Cluster s.e. t value Pr(>|t|)  
## postKSAintervention:HOUTHI   0.4184       0.1900   2.202   0.0277 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.758 on 1172455 degrees of freedom
## Multiple R-squared(full model): 0.752   Adjusted R-squared: 0.7447 
## Multiple R-squared(proj model): 0.003315   Adjusted R-squared: -0.02603 
## F-statistic(full model, *iid*):  103 on 34519 and 1172455 DF, p-value: < 2.2e-16 
## F-statistic(proj model): 4.848 on 1 and 22 DF, p-value: 0.03846
# dummified measure
summary(felm(anylit ~ postKSAintervention:HOUTHI | rownum + date | 0 | ADM1, 
    data = LIGHTS))

## 
## Call:
##    felm(formula = anylit ~ postKSAintervention:HOUTHI | rownum +      date | 0 | ADM1, data = LIGHTS) 
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.12247 -0.10416  0.00593  0.06185  1.06958 
## 
## Coefficients:
##                            Estimate Cluster s.e. t value Pr(>|t|)    
## postKSAintervention:HOUTHI  0.08470      0.02359    3.59  0.00033 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2223 on 1172455 degrees of freedom
## Multiple R-squared(full model): 0.5762   Adjusted R-squared: 0.5637 
## Multiple R-squared(proj model): 0.008458   Adjusted R-squared: -0.02073 
## F-statistic(full model, *iid*):46.18 on 34519 and 1172455 DF, p-value: < 2.2e-16 
## F-statistic(proj model): 12.89 on 1 and 22 DF, p-value: 0.00163
# taking logs
summary(felm(log(list) ~ postKSAintervention:HOUTHI | rownum + date | 0 | ADM1, 
    data = LIGHTS[!is.infinite(log(list))]))

## 
## Call:
##    felm(formula = log(list) ~ postKSAintervention:HOUTHI | rownum +      date | 0 | ADM1, data = LIGHTS[!is.infinite(log(list))]) 
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.8918  -0.3725   0.1060   0.5223   6.5958 
## 
## Coefficients:
##                            Estimate Cluster s.e. t value Pr(>|t|)    
## postKSAintervention:HOUTHI   0.4133       0.1234    3.35 0.000809 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8958 on 844476 degrees of freedom
##   (327294 observations deleted due to missingness)
## Multiple R-squared(full model): 0.6534   Adjusted R-squared: 0.6393 
## Multiple R-squared(proj model): 0.01248   Adjusted R-squared: -0.02789 
## F-statistic(full model, *iid*):46.12 on 34519 and 844476 DF, p-value: < 2.2e-16 
## F-statistic(proj model): 11.22 on 1 and 22 DF, p-value: 0.002899

An alternative way to study this is by doing a flexible non-parametric estimation to rule out diverging trends prior to the military intervention.

summary(felm(anylit ~ date:HOUTHI | rownum + date | 0 | ADM1, data = LIGHTS))

## 
## Call:
##    felm(formula = anylit ~ date:HOUTHI | rownum + date | 0 | ADM1,      data = LIGHTS) 
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.12574 -0.10765  0.00313  0.06437  1.06515 
## 
## Coefficients:
##                       Estimate Cluster s.e. t value Pr(>|t|)    
## date2014-01-01:HOUTHI       NA      0.00000      NA       NA    
## date2014-02-01:HOUTHI  0.01095      0.01320   0.830 0.406641    
## date2014-03-01:HOUTHI  0.03173      0.02764   1.148 0.250884    
## date2014-04-01:HOUTHI  0.11048      0.06028   1.833 0.066814 .  
## date2014-05-01:HOUTHI  0.09762      0.05271   1.852 0.063989 .  
## date2014-06-01:HOUTHI  0.10249      0.05861   1.749 0.080336 .  
## date2014-07-01:HOUTHI  0.07204      0.06053   1.190 0.233987    
## date2014-08-01:HOUTHI  0.06338      0.04866   1.302 0.192778    
## date2014-09-01:HOUTHI  0.03816      0.04690   0.814 0.415860    
## date2014-10-01:HOUTHI  0.04247      0.04359   0.974 0.329930    
## date2014-11-01:HOUTHI  0.05621      0.03646   1.542 0.123115    
## date2014-12-01:HOUTHI  0.02213      0.03037   0.729 0.466205    
## date2015-01-01:HOUTHI -0.02596      0.02585  -1.004 0.315415    
## date2015-02-01:HOUTHI  0.02250      0.05141   0.438 0.661649    
## date2015-03-01:HOUTHI  0.06080      0.05740   1.059 0.289437    
## date2015-04-01:HOUTHI  0.13514      0.04806   2.812 0.004925 ** 
## date2015-05-01:HOUTHI  0.15874      0.04647   3.416 0.000635 ***
## date2015-06-01:HOUTHI  0.15493      0.05151   3.008 0.002632 ** 
## date2015-07-01:HOUTHI  0.12681      0.04697   2.700 0.006944 ** 
## date2015-08-01:HOUTHI  0.12363      0.04319   2.863 0.004202 ** 
## date2015-09-01:HOUTHI  0.13972      0.05276   2.648 0.008088 ** 
## date2015-10-01:HOUTHI  0.13422      0.04697   2.857 0.004273 ** 
## date2015-11-01:HOUTHI  0.12408      0.04566   2.717 0.006578 ** 
## date2015-12-01:HOUTHI  0.12125      0.04505   2.691 0.007119 ** 
## date2016-01-01:HOUTHI  0.11971      0.03905   3.065 0.002176 ** 
## date2016-02-01:HOUTHI  0.11952      0.04151   2.879 0.003984 ** 
## date2016-03-01:HOUTHI  0.12721      0.04239   3.001 0.002693 ** 
## date2016-04-01:HOUTHI  0.12537      0.04532   2.766 0.005669 ** 
## date2016-05-01:HOUTHI  0.12989      0.05297   2.452 0.014209 *  
## date2016-06-01:HOUTHI  0.13070      0.05936   2.202 0.027675 *  
## date2016-07-01:HOUTHI  0.14831      0.06597   2.248 0.024573 *  
## date2016-08-01:HOUTHI  0.13047      0.04614   2.827 0.004693 ** 
## date2016-09-01:HOUTHI  0.14481      0.06024   2.404 0.016227 *  
## date2016-10-01:HOUTHI  0.11782      0.05255   2.242 0.024959 *  
## date2016-11-01:HOUTHI  0.12175      0.04473   2.722 0.006486 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2219 on 1172422 degrees of freedom
## Multiple R-squared(full model): 0.5776   Adjusted R-squared: 0.5652 
## Multiple R-squared(proj model): 0.01175   Adjusted R-squared: -0.01738 
## F-statistic(full model, *iid*): 46.4 on 34552 and 1172422 DF, p-value: < 2.2e-16 
## F-statistic(proj model): 147.2 on 35 and 22 DF, p-value: < 2.2e-16

This suggests that the differential drop in lights occured only after March 2015, the month in which Saudi Arabia’s military intervention commenced.

On average, the regressions suggest that the drop in lights was significantly more pronounced outside the Houthi heartland. This suggests that the conflict and the bombing carried out by Saudi Arabia is mostly concentrated outside the Houthi rebel heartland.

That the dramatic drops in light emissions is associated with the Saudi military intervention is quite clear. The conflict between the Houthi rebels and the government had been ongoing for several years but only starting with the intervention of Saudi Arabia do marked differences between Houthi and non-Houthi heartland provinces appear.

This analysis can further be refined by studying the role of the religious make up of different provinces, as the role of the religious make up between Shia and Sunni muslim groups is said to be an important factor driving this conflict.

Nevertheless, this analysis suggests that high frequency satellite images such as these can be useful in assessing the extent to which areas area directly affected by conflict, which may be useful for targeting humanitarian relief.

To leave a comment for the author, please follow the link and comment on their blog: fReigeist » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)