Pacific Island Hopping using R and iGraph

[This article was first published on The Devil is in the Data, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Last month I enjoyed a relaxing holiday in the tropical paradise of Vanuatu. One rainy day I contemplated how to go island hopping across the Pacific ocean visiting as many island nations as possible. The Pacific ocean is a massive body of water between, Asia and the Americas, which covers almost half the surface of the earth. The southern Pacific is strewn with island nations from Australia to Chile. In this post, I describe how to use R to plan your next Pacific island hopping journey.

Pacific Island hopping

The Pacific Ocean.

Listing all airports

My first step was to create a list of flight connections between each of the island nations in the Pacific ocean. I am not aware of a publically available data set of international flights so unfortunately, I created a list manually (if you do know of such data set, then please leave a comment).

My manual research resulted in a list of international flights from or to island airports. This list might not be complete, but it is a start. My Pinterest board with Pacific island airline route maps was the information source for this list.

The first code section reads the list of airline routes and uses the ggmap package to extract their coordinates from Google maps. The data frame with airport coordinates is saved for future reference to avoid repeatedly pinging Google for the same information.

# Init
library(tidyverse)
library(ggmap)
library(ggrepel)
library(geosphere)

# Read flight list and airport list
flights <- read.csv("Geography/PacificFlights.csv", stringsAsFactors = FALSE)
f <- "Geography/airports.csv"
if (file.exists(f)) {
    airports <- read.csv(f)
    } else airports <- data.frame(airport = NA, lat = NA, lon = NA)

# Lookup coordinates for new airports
all_airports <- unique(c(flights$From, flights$To))
new_airports <- all_airports[!(all_airports %in% airports$airport)]
if (length(new_airports) != 0) {
    coords <- geocode(new_airports)
    new_airports <- data.frame(airport = new_airports, coords)
    airports <- rbind(airports, new_airports)
    airports <- subset(airports, !is.na(airport))
    write.csv(airports, "Geography/airports.csv", row.names = FALSE)
}

# Add coordinates to flight list
flights <- merge(flights, airports, by.x="From", by.y="airport")
flights <- merge(flights, airports, by.x="To", by.y="airport")

Create the map

To create a map, I modified the code to create flight maps I published in an earlier post. This code had to be changed to centre the map on the Pacific. Mapping the Pacific ocean is problematic because the -180 and +180 degree meridians meet around the date line. Longitudes west of the antemeridian are positive, while longitudes east are negative.

The world2 data set in the borders function of the ggplot2 package is centred on the Pacific ocean. To enable plotting on this map, all negative longitudes are made positive by adding 360 degrees to them.

# Pacific centric
flights$lon.x[flights$lon.x < 0] <- flights$lon.x[flights$lon.x < 0] + 360
flights$lon.y[flights$lon.y < 0] <- flights$lon.y[flights$lon.y < 0] + 360
airports$lon[airports$lon < 0] <- airports$lon[airports$lon < 0] + 360

# Plot flight routes
worldmap <- borders("world2", colour="#efede1", fill="#efede1")
ggplot() + worldmap + 
    geom_point(data=airports, aes(x = lon, y = lat), col = "#970027") + 
    geom_text_repel(data=airports, aes(x = lon, y = lat, label = airport), 
      col = "black", size = 2, segment.color = NA) + 
    geom_curve(data=flights, aes(x = lon.x, y = lat.x, xend = lon.y, 
      yend = lat.y, col = Airline), size = .4, curvature = .2) + 
    theme(panel.background = element_rect(fill="white"), 
          axis.line = element_blank(),
          axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank(),
          axis.title.x = element_blank(),
          axis.title.y = element_blank()
          ) + 
    xlim(100, 300) + ylim(-40,40)

Pacific island hopping

Pacific Island Hopping

This visualisation is aesthetic and full of context, but it is not the best visualisation to solve the travel problem. This map can also be expressed as a graph with nodes (airports) and edges (routes). Once the map is represented mathematically, we can generate travel routes and begin our Pacific Island hopping.

The igraph package converts the flight list to a graph that can be analysed and plotted. The shortest_path function can then be used to plan routes. If I would want to travel from Auckland to Saipan in the Northern Mariana Islands, I have to go through Port Vila, Honiara, Port Moresby, Chuuk, Guam and then to Saipan. I am pretty sure there are quicker ways to get there, but this would be an exciting journey through the Pacific.

library(igraph)
g <- graph_from_edgelist(as.matrix(flights[,1:2]), directed = FALSE)
par(mar = rep(0, 4))
plot(g, layout = layout.fruchterman.reingold, vertex.size=0)
V(g)
shortest_paths(g, "Auckland", "Saipan")

View the latest version of this code on GitHub.

Pacific Flight network

The post Pacific Island Hopping using R and iGraph appeared first on The Devil is in the Data.

To leave a comment for the author, please follow the link and comment on their blog: The Devil is in the Data.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)