Whenever I participate in a Science Slam, I try to work in an analysis of something
typical for the respective city. My next gig will be in Munich, so there are two natural
options: beer or football. In the end I choose both, but here I will focus on the former.
#used packages
library(tidyverse) # for data wrangling
library(TSP) #solving Traveling Salesman problems
library(ggmap) #maps in ggplot2
library(leaflet) #interactive maps
Data
Munich is, among other things of course, famous for its beergardens.
If you are not German enough to know what a beergarden is, well, it is simply an outdoor
area in which beer and some local food are served. The concept in fact originated in Munich,
where you can find an enormous amount of them. I found this page,
which lists 114 in and around Munich! I scraped their list and added Longitude/Latitude as well
as the brewery that provides the beer. You can find the data as a csv file on github.
# df_beer < read_csv("beergarden.csv")
glimpse(df_beer)
## Observations: 114
## Variables: 4
## $ Name "AugustinerKeller", "Aujäger", "Alte Villa", "Alter...
## $ Biermarke "Augustiner", "Hacker Pschorr", "Kaltenberger", "Aug...
## $ Long 11.55122, 11.44961, 11.09812, 11.41751, 11.20066, 11...
## $ Lat 48.14374, 47.91929, 48.02635, 48.09848, 48.08622, 48...
We can plot the location of the beergardens using the ggmap
package.
### Get a map with Marienplatz as center
map < get_map(location = c(11.5773133,48.1382570), zoom = 9,
maptype = "roadmap",source="google",color="bw")
ggmap(map)+
geom_point(data=df_beer,aes(x=Long,y=Lat),size=2,alpha=0.75,col="goldenrod3")+
scale_x_continuous(limits=c(11.01,11.99))+
scale_y_continuous(limits=c(47.740,48.46))+
theme(legend.position="bottom",
text=element_text(size=16),
aspect.ratio = .9,
axis.ticks = element_blank(),
axis.text = element_blank())+
labs(x="",y="")
In the next section, I briefly explain what we are going to do with the data. If you are
already familiar with the traveling salesman problem, which in our case becomes the
traveling beerdrinker problem, you can safely skip that part.
Traveling Salesman Problem
What we are going to look at now is the so called Traveling salesman problem. The problem
statement is as follows:
“Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city and returns to the origin city?”
The problem was formulated in 1930 and is one of the most studied problems in optimization,
used as a benchmark for many optimization methods. Solving this problem, however, is hard. In fact, it is NP hard. But in spite of computational difficulties,
a large number of heuristics and exact algorithms exists to solve the problem. Instances with tens of thousands of cities can be solved exactly and even problems with millions of cities can be approximated well.
The method was also used by Randal Olson to create a round trip of US National Parks.
Solving the Traveling Beerdrinker Problem
To solve the traveling beerdrinker problem, we first need the distances between the
beergardens. You can use the euclidean distances if you are a flat earther, but
if you do not believe in wacky ideas, you may want to use the distGeo()
function from the
geosphere
package^{1}.
n < nrow(df_beer)
beer_dist < matrix(0,n,n)
for(i in 1:n){
for(j in 1:n){
beer_dist[i,j] < geosphere::distGeo(c(df_beer$Long[i],df_beer$Lat[i]),
c(df_beer$Long[j],df_beer$Lat[j]))
}
}
With the distances at hand, we can now solve the problem with the TSP
package.
In order to use the best possible algorithm, you need to download concorde and
linkern executables for your OS from here.
Use concorde_path()
to let R know where the executables are located.
You can also simply use the default method since our problem is rather small.
beer_tsp < TSP(beer_dist,labels=df_beer$name)
beer_route < solve_TSP(beer_tsp,method="concorde")
#default:
#beer_route < solve_TSP(beer_tsp,control=list(repetitions=100,two_opt=TRUE))
The function tour_length()
tells us, how long the tour is. Our beergarden tour for instance
is 437.21 km long (For my American friends: That should be around 271.67 miles). Let’s put it on a map.
#turn tour into integer sequence for plotting
beer_route < as.integer(beer_route)
beer_route < c(beer_route,beer_route[1])
ggmap(map)+
geom_path(data=df_beer[beer_route,],aes(x=Long,y=Lat))+
geom_point(data=df_beer,aes(x=Long,y=Lat),size=2,col="goldenrod3")+
scale_x_continuous(limits=c(11.01,11.99))+
scale_y_continuous(limits=c(47.740,48.46))+
theme(legend.position="bottom",
text=element_text(size=16),
aspect.ratio = 0.9,
axis.ticks = element_blank(),
axis.text = element_blank())+
labs(x="",y="")
If you fancy interactive maps, you can use the leaflet
package.
m1 < df_beer %>%
mutate(popup_text=paste(sep = "
", paste0("", Name, ""), Biermarke)) %>%
leaflet() %>%
addTiles() %>%
setView(11.5773133,48.1382570, zoom = 10) %>%
addPolylines(data=df_beer[beer_route,],lat=~Lat,lng=~Long) %>%
addCircleMarkers(lng=~Long,
lat=~Lat,
popup = ~popup_text,
radius=4,
color = "black",
stroke=FALSE,
fillOpacity = 0.7)
m1
Clicking on a beergarden will give you its name and the brewery that supplies it.
More than 400km (240 miles) is certainly not something that can be done on a single day.
We reduce the problem now a bit and just look at the beergardens that are not
further than 15km (9.32 miles) away from the Marienplatz.
center < c(11.5773133,48.1382570)
center_dist < rep(0,n)
for(i in 1:n){
center_dist[i] < geosphere::distGeo(c(df_beer$Long[i],df_beer$Lat[i]),
center)
}
#Beergarden less than 15km from Marienplatz
idx < which(center_dist<=15000)
beer_tsp < TSP(beer_dist[idx,idx],labels=df_beer$Name[idx])
beer_route_small < solve_TSP(beer_tsp,method="concorde")
This gives us a tour of 77 beergarden with 168.94 km
(104.98 miles). Doable in a day I’d say.
Yet, I would not recommend to drink a Weißbier at every single beergarden. That may well kill you.
The interactive map below shows the route.
beer_route_small < as.integer(beer_route_small)
beer_route_small < c(beer_route_small,beer_route_small[1])
m2 < df_beer %>%
mutate(popup_text=paste(sep = "
", paste0("", Name, ""), Biermarke)) %>%
leaflet() %>%
addTiles() %>%
setView(11.5773133,48.1382570, zoom = 10) %>%
addPolylines(data=df_beer[idx[beer_route_small],],lat=~Lat,lng=~Long,col="red") %>%
addCircleMarkers(lng=~Long,
lat=~Lat,
popup=~popup_text,
radius=4,
color = "black",
stroke=FALSE,
fillOpacity = 0.7)
m2

Technically this is also not entirely correct since we are calculating distances
“as the crow flies” and not using road connections.↩
Rbloggers.com offers daily email updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...