Site icon R-bloggers

Optimal Meeting Point on the Paris Metro

[This article was first published on jean-robert.github.com, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

tl;dr: Play with the app here

When you live in Paris, chances are you are (home or work) very close to a metro station, so when you want to meet with some friends, you usually end up picking another metro station as a meeting point. Yet, finding the optimal place to meet can easily become a complex problem considering the dense network we have. Now that the RATP (the public transport operator in Paris) had made some of their datasets available, this sounds like a good job to be solved with R and Shiny.

In the spirit of the current open data movement, the RATP has made available a number of datasets under the Etalab license, and among them, two are of a particular interest for us:

arret_ligne <- read.csv("ratp_arret_ligne.csv", header=F, sep="#",
	    col.names=c("ID","Ligne","Type"),
	    stringsAsFactors=F)

arret_positions <- read.csv("ratp_arret_graphique.csv", header=F, sep="#",
		col.names=c("ID","X","Y","Nom","Ville","Type"),
		stringsAsFactors=F)

To state our problem more clearly, we are given initially a set of n metro stops among all N possible, and we want to find S the optimal stop where to meet. A first step will involve computing the distances among all metro stops (shortest path, preferably on a time scale rather than a space scale!), and the second step is to find some kind of “barycenter” of these n stops. For these purposes, we model our metro network as a graph. The shortest path among two stops can be found using the very common Dijkstra algorithm, while defining the “barycenter” can be a bit cumbersome. Using a geographic barycenter doesn’t make any sense (we might end up in a place with no stop, or even with the closest stop being physically far away from a duration perspective). The next thing could be to think of this problem as finding the centroid of the cluster formed by our n stops, using something in the spirit of k-means (which doesn’t need actual points in space but only distances), and mapping this centroid to our larger network, but empirically the results didn’t look sound. No point in complicating things, another way to think of this is merely as a minimax problem: finding the stop which minimizes the maximum distance from each of the n initial stops to S. And this is actually very easy to implement in R!

There are two technical problems worth highlighting:

  1. A stopping time (train slows/accelerates, doors open/close, people get in/out), about 1 minute
  2. A connection time (only when changing lines), about 5 minutes
  3. The actual transportation time (physically between A and B), proportional to the geographical distance from A to B, about 100 minute per degree. Since stops are referenced with longitude and latitude, the physical distance between the two is in degrees… The calibration of the model has been done manually with some trial and error, it is a rough estimate and not perfect at all, but that does the job.

So here we are, equipped with our distance matrix D between any two stops of the RATP metro network, ready to identify the optimal stop to meet when people come from, say, Barbès-Rochechouart, Bastille, Dupleix, and Pernety.

sources <- getIDFromStation(c("Barbès-Rochechouart", "Bastille", "Dupleix", "Pernéty"))
target <- names(which.min(apply(D[,format(sources)], 1, max)))
getStationFromID(target)
# "Odéon"

That’s actually great because Odéon is a cool place to get together and have a drink. Thinking about it, it would actually be useful to integrate a penalty for metro stops without any nice bar around…
In case you have a similar problem one of these days, consider using that page, where a Shiny version of the app is available, along with a ggplot2 chart. I know this should be done in D3 for more interactivity, but that’s on my todo for sure!

To leave a comment for the author, please follow the link and comment on their blog: jean-robert.github.com.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.