[This article was first published on alexfarquhar's posterous, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I’ve been playing around with the rgl package in the last week, as part of an ongoing quest to come up with nice-looking (but more importantly, useful) data vizualisations. It’s a nice little package, and once you’ve run through the excellent examples, you can rapidly create some cool stuff. The example that initially caught my eye was this one, which creates a 3D plot of the ‘volcano’ dataset in only a few lines of R code:

data(volcano) y <- 2 * volcano # Exaggerate the relief x <- 10 * (1:nrow(y)) # 10 meter spacing (S to N) z <- 10 * (1:ncol(y)) # 10 meter spacing (E to W) ylim <- range(y) ylen <- ylim[2] - ylim[1] + 1 colorlut <- terrain.colors(ylen) # height color lookup table col <- colorlut[ y-ylim[1]+1 ] # assign colors to heights rgl.open() rgl.surface(x, z, y, color=col, back="lines")

What you can see here is the literal mapping of geographical height - the volcano dataset is a set of height measurements over a 10m x 10m grid of a volcano in New Zealand. What you can't see from this image is that the visualization is zoomable and rotatable (if that's a real word). The power of this is hard to describe until you've actually used it to fly through a dataset. The obvious extension of this is to map something other than physical height on the vertical axis. It could be anything - pollen counts, crime figures etc. So let see how we can do that...

Firstly, of course, you'll need to have R installed, along with the rgl package, which you can install by running:

install.packages('rgl')

And secondly, you'll need some data. I've prepared a sample (fake) dataset along with all the code from this post, you can find it here. It consists of 3 columns, 'latitude', 'longitude' and 'calls'. The first 2 columns represent a location, and the third column represents the count of mobile phone calls at the location. Let's begin by reading the data into a dataframe. I'll be using some of the awesome reshape package as well, so we'll need to load that along with rgl:

library(rgl) library(reshape) rgl.clear(type = c("shapes")) calls = read.delim('call_data.tsv', header = T)

Now that we've loaded the data, we hit our first problem. The volcano example above uses the rgl.surface function, which takes a matrix argument (not a dataframe sadly). What's more, in the example, the volcano data has already been nicely arranged into a grid, with each cell representing the average height of a 10m x 10m grid square. By contrast, our calls dataset has arbitrary lat/long pairs. So (as ever) we'll need to do some manipulation of our data before we can plot it.What we need to do is bucket our call counts into evenly spaced divisions, in order that we can create a grid representing the geographical layout. Fortunately, R has full support for this kind of binning, using the cut function:

bin_size = 0.18 calls$long_bin = cut(calls$long, seq(min(calls$long), max(calls$long), bin_size)) calls$lat_bin = cut(calls$lat, seq(min(calls$lat), max(calls$lat), bin_size)) calls$total = log(calls$total) / 3 #flatten out totals

So now we've created a grid system, with each row in the dataset falling into a 0.18 x 0.18 degree grid square (I chose 0.18 for the most important reason - its makes the visualization look better :-)). Next we have to sum up all the call counts which are in the same lat/long bucket. Thankfully we have the splendid reshape library to help us here:

calls = melt(calls[,3:5]) calls = cast(calls, lat_bin~long_bin, fun = sum, fill = 0) calls = calls[,2:(ncol(calls)-1)] calls = as.matrix(calls)

Nearly there! We've now got our matrix, so we need to define the x, y, and z data for the rgl.surface function (run help(rgl.surface) for more details), and then call it:

x = (1: nrow(calls)) z = (1: ncol(calls)) rgl.surface(x, z, calls) rgl.bringtotop()

And here's the result:

Cool - a 3D representation of the number of phone calls across the US. But that's not all! We can add colors to show the different counts more clearly. Firstly we'll clear the previous plot by using rgl.pop(), then create a color vector from the data:

rgl.pop() # nicer colored plot ylim <- range(calls) ylen <- ylim[2] - ylim[1] + 1 col <- topo.colors(ylen)[ calls-ylim[1]+1 ] x = (1: nrow(calls)) z = (1: ncol(calls)) rgl.bg(sphere=FALSE, color=c("black"), lit=FALSE) rgl.viewpoint( theta = 300, phi = 30, fov = 170, zoom = 0.03) rgl.surface(x, z, calls, color = col, shininess = 10) rgl.bringtotop()

Nice - and there you have it, genuine 3D mapping in about 30 lines of R. Next time I may make some videos, so watch this space 😉