East-West Divide

[This article was first published on R | Quantum Jitter, and kindly contributed to R-bloggers.]


library(wesanderson)
cols <- wes_palette(8, name = "FantasticFox1", type = "continuous")

When tensions heightened at the United Nations early in 2019, I wondered whether we had drawn closer, or farther apart, over the decades since the body was established in 1945.

I’ll see if I can garner a clue by performing cluster analysis on the General Assembly voting of five of the founding members. I’ll focus on the five permanent members (P5) of the Security Council so I can later see if Security Council vetoes corroborate the findings.

I’ll refer to the five simply as China, France, Russia, the UK, and the US rather than, for example, the Russian Federation, or formerly the USSR.

General Assembly

The unvotes package provides the voting history of the General Assembly, the only organ of the United Nations in which all 193 member states have equal representation.

un_df <- un_votes %>%
  inner_join(un_roll_calls, by = "rcid") %>%
  filter(country_code %in% c("GB", "CN", "US", "FR", "RU")) %>%
  mutate(
    country = recode(
      country_code,
      GB = "UK",
      CN = "China",
      FR = "France",
      RU = "Russia"
    ),
    date = ymd(date)
  )

tidy_df <- un_df %>%
  distinct(country, rcid, vote) %>%
  tidyr::complete(country, nesting(rcid)) %>%
  mutate(
    vote = replace_na(as.character(vote), "na"),
    rcid_vote = str_c(rcid, vote), value = 1
  ) %>%
  group_by(rcid) %>% 
  mutate(variation = n_distinct(vote)) %>% 
  ungroup() %>% 
  filter(variation != 1) 

wide_df <- tidy_df %>% 
  pivot_wider(id_cols = country, names_from = rcid_vote, values_from = value, values_fill = 0) %>% 
  column_to_rownames(var = "country")

K-means clustering is one of the most commonly used unsupervised machine learning algorithms for partitioning data into clusters. It requires the number of clusters to be pre-determined, and seeks to minimise the total within-cluster variation. The question concerns East and West, so I’ll look for two clusters.
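To make the idea concrete, here is a minimal sketch of k-means in base R. The data are invented purely for illustration (they are not UN votes), and the object names `toy` and `toy_clust` are hypothetical.

```r
# Toy data, invented for illustration: two well-separated groups of points
toy <- rbind(
  matrix(c(0, 0, 0.2, 0.1, 0.1, 0.3), ncol = 2, byrow = TRUE),
  matrix(c(5, 5, 5.2, 4.9, 4.8, 5.1), ncol = 2, byrow = TRUE)
)
rownames(toy) <- c("a", "b", "c", "x", "y", "z")

set.seed(1)  # k-means starts from random centres, so fix the seed
toy_clust <- kmeans(toy, centers = 2, nstart = 30)

toy_clust$cluster       # cluster label assigned to each row
toy_clust$tot.withinss  # total within-cluster sum of squares being minimised
```

The `nstart` argument reruns the algorithm from several random starting points and keeps the solution with the lowest total within-cluster sum of squares.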

Because we have many more than two variables, fviz_cluster uses Principal Component Analysis to reduce the dimensions. Two new variables form the x and y axes, in this case capturing over 70% of the variance in the original variables. This allows the relative distance between the two clusters to be visualised in two dimensions.
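As a rough illustration of where such a percentage comes from, the sketch below runs base R's `prcomp` on a small invented matrix and computes the share of variance captured by the first two components. The data and the names `toy`, `pca` and `var_explained` are assumptions made for this example, and fviz_cluster may preprocess the data differently.

```r
# Toy data, invented for illustration: 5 observations of 4 indicator variables
toy <- matrix(
  c(1, 0, 1, 0,
    1, 0, 0, 1,
    0, 1, 1, 0,
    0, 1, 0, 1,
    1, 1, 0, 0),
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("obs", 1:5), paste0("v", 1:4))
)

pca <- prcomp(toy)

# Proportion of the total variance captured by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Share covered by the two components a two-dimensional plot would use
sum(var_explained[1:2])

# Coordinates of each observation on those two components
pca$x[, 1:2]
```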

So, there is a distance between East and West over the full history of voting, with France, the UK and the US together in one cluster. China and Russia form the other.


kclust <- wide_df %>% 
  kmeans(2, nstart = 30)

from <- min(year(un_df$date))
to <- max(year(un_df$date))  

kclust %>%
  fviz_cluster(
    data = wide_df[, -1],
    palette = cols[c(8, 5)],
    labelsize = 10,
    ggtheme = theme_bw()
  ) +
  labs(title = glue("P5 Distance Over the Period {from} to {to}"), 
       subtitle = "K-means Clusters", caption = "Source: unvotes")

But this represents the average clustering over 70 years of UN voting. Has the distance changed over time? I’ll divide the voting into two equal parts to assess the change.

rcid_df <- tidy_df %>%
  mutate(era = if_else(rcid < median(rcid), "early", "late"),
         ctry_era = str_c(country, "_", era)) %>%
  pivot_wider(id_cols = ctry_era, names_from = rcid_vote, 
              values_from = value, values_fill = 0) %>%
  column_to_rownames(var = "ctry_era")

Now I have 10 observations: five for the early years by country, and five for the later years. So now let’s see if we’re getting closer or farther apart. This time I’ll model 4 clusters to see if this emerges as East and West in both the early and late eras.
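The era split can be sketched on a toy example in base R. The roll-call ids and country labels below are invented, and using the median id as the cut point treats `rcid` as a proxy for time order, which is an assumption.

```r
# Toy roll-call ids and countries, invented for illustration
toy <- expand.grid(rcid = 1:6, country = c("A", "B"), stringsAsFactors = FALSE)

# Roll calls below the median id are labelled "early", the rest "late"
toy$era <- ifelse(toy$rcid < median(toy$rcid), "early", "late")
toy$ctry_era <- paste(toy$country, toy$era, sep = "_")

# Each country now appears once per era
unique(toy$ctry_era)
table(toy$ctry_era)
```

With two countries this yields four country-era labels; with the five P5 members the same logic yields the ten observations described above.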

What do we see?


era_clust <- rcid_df %>% 
  kmeans(4, nstart = 30)

era_clust %>%
  fviz_cluster(
    data = rcid_df[, -1],
    repel = TRUE,
    palette = cols[c(8, 6, 4, 1)],
    labelsize = 10,
    ggtheme = theme_bw()
  ) +
  labs(title = glue("A Tale of Two Eras {from} to {to}"), 
       subtitle = "K-means Clusters", caption = "Source: unvotes")

For the first half of the roll-calls, France, the UK and the US formed one cluster, whilst Russia was at some distance.

Although the Republic of China (ROC) joined the UN at its founding in 1945, it’s worth noting that the People’s Republic of China (PRC), commonly called China today, was admitted into the UN in 1971. Hence its greater distance in the clustering for the early years.

For the second half of the roll-calls, France and the UK remain close. Not surprising given our EU ties. Will Brexit have an impact going forward? The US is slightly separated from its European allies, but what is perhaps of greater note is the shorter distance between the Western three and Russia & China. Will globalisation continue to bring us closer together, or is the tide about to turn?

Security Council Vetoes

The above analysis has focused on General Assembly voting. By web-scraping the UN’s Security Council Veto List, we can acquire further insights on the voting patterns of the P5.
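For readers new to rvest, the sketch below shows the general pattern of selecting a table by CSS class and converting it to a data frame. It parses a small inline HTML string rather than a live page; the table contents and the name `veto_tbl` are invented for illustration, and the real page's structure may differ.

```r
library(rvest)

# Hypothetical stand-in for a scraped page: a tiny inline HTML table
page <- minimal_html('
  <table class="tablefont">
    <tr><th>Date</th><th>Vetoed by</th></tr>
    <tr><td>1 January 2000</td><td>Country A</td></tr>
    <tr><td>2 February 2001</td><td>Country B, Country C</td></tr>
  </table>
')

# Select the table by its CSS class, then convert it to a tibble
veto_tbl <- page %>%
  html_element(".tablefont") %>%
  html_table()

veto_tbl
```

Against a live site, `read_html(url)` would take the place of `minimal_html()`.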

url <- "https://www.un.org/depts/dhl/resguide/scact_veto_table_en.htm"

meeting_df <- url %>% 
  read_html() %>%
  html_node(".tablefont") %>% 
  html_table(fill = TRUE) %>% 
  select(date = 1, draft = 2, meeting = 3, agenda = 4, vetoed_by = 5)

meeting_df2 <- meeting_df %>%
  mutate(
    date = str_remove(date, "-" %R% dgt(2)),
    date = dmy(date),
    date = if_else(date == ymd("0086-01-30"), ymd("1986-01-30"), date),
    vetoed_by = str_replace(vetoed_by, "USSR", "Russia"),
    Russia = if_else(str_detect(vetoed_by, "Russia"), 1, 0),
    China = if_else(str_detect(vetoed_by, "China"), 1, 0),
    France = if_else(str_detect(vetoed_by, "France"), 1, 0),
    US = if_else(str_detect(vetoed_by, "US"), 1, 0),
    UK = if_else(str_detect(vetoed_by, "UK"), 1, 0)
    ) %>% 
  pivot_longer(c(Russia:UK), names_to = "country", values_to = "veto") %>% 
  filter(veto == 1)

country_df <- meeting_df2 %>%
  count(country) %>% 
  mutate(country = fct_reorder(country, n))

Interestingly, Russia dominated the early vetoes before these dissipated in the late 60s. Vetoes picked up again in the 70s with the US dominating through to the 80s. And there would certainly appear to be less dividing us since the 90s.

Do the vetoes since 2015 suggest a turning of the tide?

little_plot <- country_df %>% 
  ggplot(aes(country, n, fill = country)) +
  geom_col() +
  coord_flip() +
  scale_fill_manual(values = cols[c(2, 3, 4, 6, 8)]) +
  geom_label(aes(label = n), colour = "white", hjust = "inward") +
  labs(x = NULL, y = NULL, fill = NULL, title = "Most Vetoes",
       caption = "Source: research.un.org")

year_df <- meeting_df2 %>%
  mutate(year = year(date)) %>% 
  count(year, country)

as_at <- format(today(), "%b %d, %Y")
big_plot <- year_df %>% 
  ggplot(aes(year, n, fill = country)) +
  geom_col(show.legend = FALSE) +
  scale_fill_manual(values = cols[c(2, 3, 4, 6, 8)]) +
  scale_x_continuous(breaks = (seq(1945, 2020, 5))) +
  labs(x = NULL, y = "Veto Count", fill = NULL,
       title = "A Turning Tide?",
       subtitle = glue("Security Council Vetoes as at {as_at}")) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

layout <- "AAB"
big_plot + little_plot + plot_layout(design = layout)

R Toolbox

Listing the packages and functions used in this post lets me separately build a toolbox visualisation of package and function usage across all posts.

| Package | Function |
|---|---|
| base | library[11]; c[3]; set.seed[2]; as.character[1]; conflicts[1]; cumsum[1]; format[1]; function[1]; max[1]; min[1]; search[1]; seq[1]; sum[1] |
| dplyr | mutate[11]; if_else[10]; filter[7]; count[2]; group_by[2]; n[2]; select[2]; tibble[2]; arrange[1]; as_tibble[1]; desc[1]; distinct[1]; inner_join[1]; n_distinct[1]; recode[1]; slice[1]; summarise[1]; ungroup[1] |
| factoextra | fviz_cluster[2] |
| forcats | fct_reorder[1] |
| ggplot2 | labs[4]; aes[3]; theme_bw[3]; geom_col[2]; ggplot[2]; scale_fill_manual[2]; coord_flip[1]; element_text[1]; geom_label[1]; scale_x_continuous[1]; theme[1]; theme_set[1] |
| glue | glue[4] |
| graphics | layout[1] |
| kableExtra | kable[1] |
| lubridate | date[4]; year[3]; ymd[3]; dmy[1]; today[1] |
| patchwork | plot_layout[1] |
| purrr | map[1]; map2_dfr[1]; possibly[1]; set_names[1] |
| readr | read_lines[1] |
| rebus | literal[4]; lookahead[3]; whole_word[2]; ALPHA[1]; dgt[1]; lookbehind[1]; one_or_more[1]; or[1] |
| rvest | html_node[1]; html_table[1] |
| stats | kmeans[2]; median[1] |
| stringr | str_detect[8]; str_c[4]; str_remove[3]; str_count[1]; str_remove_all[1]; str_replace[1] |
| tibble | column_to_rownames[2]; enframe[1] |
| tidyr | pivot_wider[2]; tibble[2]; as_tibble[1]; nesting[1]; pivot_longer[1]; replace_na[1]; unnest[1] |
| wesanderson | wes_palette[1] |
| xml2 | read_html[1] |
