Maps are great – German Gas Prices illustrated

November 12, 2016

(This article was first published on Florian Teschner, and kindly contributed to R-bloggers)

Maps are among the most appealing data visualisations.
I love them because they combine an incredible information density with intuitive readability.
I also have the feeling that most people prefer maps over other visualisations. (Is there research on this?)
So it is time to get R-map-ready.

As a toy example, I downloaded all German gas stations located next to the “Autobahn”. Along with the names, I got the exact locations (as latitude/longitude) and the price of gasoline at each station. (Prices are in euros and were collected on a Friday night within a time span of roughly 30 minutes.)
For starters, we just plot all gas stations on a map and color them by their price for (Super E5) gasoline.
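
All snippets in this post assume a data frame called dfff with one row per station and numeric columns lat, lon and price. How it was built is not shown here, so the following is only a hypothetical stand-in with randomly generated placeholder values (not real stations or prices), meant to make the expected shape explicit:

```r
# Hypothetical stand-in for the real data: random points inside a rough
# bounding box around Germany, with made-up prices. Not real data.
set.seed(42)
n <- 200
dfff <- data.frame(
  lat   = runif(n, min = 47.5, max = 54.5),
  lon   = runif(n, min = 6.0,  max = 14.5),
  price = round(runif(n, min = 1.25, max = 1.55), 3)
)
str(dfff)
```

Any data frame with these three columns should work with the plotting code that follows.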

library(ggmap)  ## also attaches ggplot2

## dfff: one row per station, with numeric columns lat, lon and price
Germany.map <- get_map(location = "Germany", zoom = 6, color = "bw")  ## get MAP data

p <- ggmap(Germany.map)
p <- p + geom_point(data = dfff, aes(x = lon, y = lat, color = price))
p <- p + scale_color_gradient(low = "yellow", high = "red", guide = guide_legend(title = "Price"))
p + theme(axis.title = element_blank(),
          axis.text = element_blank(),
          axis.ticks = element_blank()) + ggtitle("All Gas Stations along the Autobahn")

[Figure: All Gas Stations along the Autobahn]

The Autobahn is clearly marked by the yellow/red dots.

In a second step let’s create a density-based map. It ignores prices but takes the 2D-density of gas stations into consideration. It answers the question: where are the most gas stations per square mile?

options(stringsAsFactors = TRUE)  ## need to run this --- weird ggplot bug!
p <- ggmap(Germany.map)
p <- p + stat_density_2d(bins = 30, geom = "polygon", size = 2, data = dfff, aes(x = lon, y = lat, alpha = ..level.., fill = ..level..))
p <- p + scale_fill_gradient(low = "yellow", high = "red", guide = FALSE) + scale_alpha(range = c(0.02, 0.8), guide = FALSE) + xlab("") + ylab("")

p + theme(axis.title = element_blank(),
          axis.text = element_blank(),
          axis.ticks = element_blank()) + ggtitle("Gas Station Density")

[Figure: Gas Station Density]
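
According to the ggplot2 documentation, stat_density_2d computes its estimate with MASS::kde2d. To see that step without any map, here is a minimal sketch on randomly generated placeholder coordinates (two artificial clusters, not the real stations):

```r
library(MASS)  # ships with R; provides kde2d()

# Placeholder coordinates (random, not the real station data):
# a large cluster around (9, 50) and a smaller one around (13, 52.5)
set.seed(1)
lon <- c(rnorm(150, mean = 9,  sd = 0.5), rnorm(50, mean = 13,   sd = 0.5))
lat <- c(rnorm(150, mean = 50, sd = 0.5), rnorm(50, mean = 52.5, sd = 0.5))

# Kernel density estimate on a 50 x 50 grid;
# rows of dens$z follow dens$x (lon), columns follow dens$y (lat)
dens <- kde2d(lon, lat, n = 50)

# Grid cell with the highest estimated density
peak <- which(dens$z == max(dens$z), arr.ind = TRUE)
c(lon = dens$x[peak[1, "row"]], lat = dens$y[peak[1, "col"]])
```

The contour levels that ggplot2 draws as polygons are derived from a grid like dens$z.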

A more interesting question might be: are there regional clusters with higher prices?
In order to illustrate regional prices, we can aggregate prices over a regular grid (stat_summary_2d, which computes the mean per bin by default) and plot the result as tiles on top of the map.

p <- ggmap(Germany.map)
p <- p + stat_summary_2d(geom = "tile", bins = 50, data = dfff, aes(x = lon, y = lat, z = price), alpha = 0.5)
p <- p + scale_fill_gradient(low = "yellow", high = "red", guide = guide_legend(title = "Price")) + xlab("") + ylab("")
p + theme(axis.title = element_blank(),
          axis.text = element_blank(),
          axis.ticks = element_blank()) + ggtitle("Gas Price Clusters")

[Figure: Gas Price Clusters]
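
What the tiles represent can be sketched in base R: cut both axes into bins and take the mean price per grid cell. This is only an approximation on randomly generated placeholder data; ggplot2's exact bin boundaries may differ.

```r
# Placeholder data (random, not real stations or prices)
set.seed(7)
d <- data.frame(
  lon   = runif(500, min = 6,    max = 15),
  lat   = runif(500, min = 47,   max = 55),
  price = runif(500, min = 1.25, max = 1.55)
)

# Cut both axes into 5 equal-width bins (the map above uses bins = 50)
d$lon_bin <- cut(d$lon, breaks = 5)
d$lat_bin <- cut(d$lat, breaks = 5)

# Mean price per grid cell; cells without stations would come out as NA
grid_mean <- tapply(d$price, list(d$lat_bin, d$lon_bin), mean)
round(grid_mean, 3)
```

Each entry of grid_mean corresponds to one tile, and its value is what gets mapped to the fill color.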

While that gives some insight, it feels clunky.
A better way is to first split the stations into four price groups (using cut2 from Hmisc, with g = 4 for quantile groups), then show the station density of each group on its own map (with facet_wrap).

library(Hmisc)  ## provides cut2()
dfff$priceGroups <- cut2(dfff$price, g = 4)  ## four quantile groups

p <- ggmap(Germany.map)
p <- p + stat_density_2d(geom = "polygon", bins = 30, data = dfff, aes(x = lon, y = lat, alpha = ..level.., fill = ..level..))
p <- p + facet_wrap(~priceGroups) + scale_fill_gradient(low = "yellow", high = "red", guide = FALSE) + scale_alpha(range = c(0.02, 0.8), guide = FALSE) + xlab("") + ylab("")

p + theme(axis.title = element_blank(),
          axis.text = element_blank(),
          axis.ticks = element_blank()) + ggtitle("Maps by Gas Price Cluster")

[Figure: Maps by Gas Price Cluster]
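
If Hmisc is not at hand, a similar quartile grouping can be sketched in base R with quantile and cut. This is an approximation on randomly generated placeholder prices; the interval labels and the handling of ties will not match cut2 exactly.

```r
# Placeholder prices (random, not real data)
set.seed(3)
price <- round(runif(100, min = 1.25, max = 1.55), 3)

# Quartile boundaries, then one factor level per quartile
breaks <- quantile(price, probs = seq(0, 1, by = 0.25))
price_groups <- cut(price, breaks = breaks, include.lowest = TRUE)

table(price_groups)  # number of stations per price group
```

The resulting factor can be used in facet_wrap in the same way as priceGroups above.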

I hope this short toy example showed what can be (easily) done with maps in R/ggplot.

Here are the 3 take-aways:
1. Use stat_density_2d(geom = "polygon", bins = 30, data = dfff, aes(x = lon, y = lat, alpha = ..level.., fill = ..level..)) to plot the DENSITY of x/y coordinates on a map.
2. Use stat_summary_2d(geom = "tile", bins = 50, data = dfff, aes(x = lon, y = lat, z = price)) to plot the AGGREGATION of a third variable (e.g. price) on a map.
3. options(stringsAsFactors = TRUE) needs to be set in order for stat_density_2d(geom = "polygon") to work; for more “details”, see Stack Overflow.
