Generating d3js Motion Charts from rCharts

[This article was first published on OUseful.Info, the blog... » Rstats, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Remember Gapminder, the animated motion chart popularised by Hans Rosling in his TED Talks and Joy of Stats TV programme? Well it’s back on TV this week in Don’t Panic – The Truth About Population, a compelling piece of OU/BBC co-produced stats theatre featuring Hans Rosling, and a Pepper’s Ghost illusion brought into the digital age courtesy of the Musion projection system:

Whilst considering what materials we could use to support the programme, we started looking for ways to make use of the Gapminder visualisation tool that makes several appearances in the show. Unfortunately, neither Gapminder (requires Java?), nor the Google motion chart equivalent of it (requires Flash?), appear to work with a certain popular brand of tablet that is widely used as a second screen device…

Looking around the web, I noticed that that Mike Bostock had produced a version of the motion chart using d3.js: The Wealth & Health of Nations. Hmmm…

Playing with that rendering on a tablet, I had a few problems when trying to highlight individual countries – the interaction interfered with an invisible date slider control – but a quick shout out to my OU colleague Pete Mitton resulted in a tweaked version of the UI with the date control moved to the side. I also added a tweak to allow specified countries to be highlighted. You can find an example here (source).

Looking at how the data was pulled into the chart, it seems to be quite a convoluted form of JSON. After banging my head against a wall for a bit, a question on Stack Overflow about how to wrangle the data from something that looked like this:

Country Region  Year    V1  V2
AAAA    XXXX    2001    12  13
BBBB    YYYY    2001    14  15
AAAA    XXXX    2002    36  56
AAAA    XXXX    1999    45  67

to something that looked like this:

[
  {"Country": "AAAA",
   "Region":"XXXX",
    "V1": [ [1999,45], [2001,12] , [2002,36] ],
    "V2":[ [1999,67], [2001,13] , [2002,56] ]
  },
  {"Country": "BBBB",
   "Region":"YYYY",
   "V1":[ [2001,14] ],
   "V2":[ [2001,15] ]
  }
]

resulted in a handy function from Ramnath Vaidyanathan that fitted the bill.

One of the reasons that I wanted to use R for the data transformation step, rather than something like Python, was that I was keen to try to get a version of the motion charts working with the rCharts library. Such is the way of the world, Ramnath is the maintainer of rCharts, and with his encouragement I had a go at getting the motion chart to work with that library, heavily cribbing from @timelyportfolio’s rCharts Extra – d3 Horizon Conversion tutorial on getting things to work with rCharts along the way.

For what it’s worth, my version of the code is posted here: rCharts_motionchart.

I put together a couple of demo’s that seem to work, including the one shown below that pulls data from the World Bank indicators API and then chucks it onto a motion chart…

To start with, here are a couple of helper functions:

require('WDI')

#A handy helper function for getting country data - this doesn't appear in the WDI package?
#---- https://code.google.com/p/google-motion-charts-with-r/source/browse/trunk/demo/WorldBank.R?r=286
getWorldBankCountries <- function(){
  require(RJSONIO)
  wbCountries <- fromJSON("http://api.worldbank.org/country?per_page=300&format=json")
  wbCountries <- data.frame(t(sapply(wbCountries[[2]], unlist)))
  wbCountries$longitude <- as.numeric(wbCountries$longitude)
  wbCountries$latitude <- as.numeric(wbCountries$latitude)
  levels(wbCountries$region.value) <- gsub("\\(all income levels\\)", "", levels(wbCountries$region.value))
  return(wbCountries)
}


#----http://stackoverflow.com/a/19729235/454773
pluck_ = function (element){
  function(x) x[[element]]
}

#' Zip two vectors
zip_ <- function(..., names = F){
  x = list(...)
  y = lapply(seq_along(x[[1]]), function(i) lapply(x, pluck_(i)))
  if (names) names(y) = seq_along(y)
  return(y)
}

#' Sort a vector based on elements at a given position
sort_ <- function(v, i = 1){
  v[sort(sapply(v, '[[', i), index.return = T)$ix]
}

library(plyr)

This next bit still needs some refactoring, and a bit of work to get it into a general form:

#I chose to have a go at putting all the motion chart parameters into a list

params=list(
  start=1950,
  end=2010,
  x='Fertility',
  y='GDP',
  radius='Population',
  color='Region',
  key='Country',
  yscale='log',
  xscale='linear',
  rmin=0,
  xmin=0

  )

##This bit needs refactoring - grab some data; the year range is pulled from the motion chart config;
##It would probably make sense to pull countries and indicators etc into the params list too?
##That way, we can start to make this block a more general function?

tmp=getWorldBankCountries()[,c('iso2Code','region.value')]
names(tmp)=c('iso2Code','Region')

data <- WDI(indicator=c('SP.DYN.TFRT.IN','SP.POP.TOTL','NY.GDP.PCAP.CD'),start = params$start, end = params$end,country=c("BD",'GB'))
names(data)=c('iso2Code','Country','Year','Fertility','Population','GDP')

data=merge(data,tmp,by='iso2Code')

#Another bit of Ramnath's magic - http://stackoverflow.com/a/19729235/454773
dat2 <- dlply(data, .(Country, Region), function(d){
  list(
    Country = d$Country[1],
    Region = d$Region[1],
    Fertility = sort_(zip_(d$Year, d$Fertility)),
    GDP = sort_(zip_(d$Year, d$GDP)),
    Population=sort_(zip_(d$Year, d$Population))
  )
})

#cat(rjson::toJSON(setNames(dat2, NULL)))

To minimise the amount of motion chart configuration, can we start to set limits based on the data values?

#This really needs refactoring/simplifying/tidying/generalising
#I'm not sure how good the range finding heuristics I'm using are, either?!
paramsTidy=function(params){
  if (!('ymin' %in% names(params))) params$ymin= signif(min(0.9*data[[params$y]]),3)
  if (!('ymax' %in% names(params))) params$ymax= signif(max(1.1*data[[params$y]]),3)
  if (!('xmin' %in% names(params))) params$xmin= signif(min(0.9*data[[params$x]]),3)
  if (!('xmax' %in% names(params))) params$xmax= signif(max(1.1*data[[params$x]]),3)
  if (!('rmin' %in% names(params))) params$rmin= signif(min(0.9*data[[params$radius]]),3)
  if (!('rmax' %in% names(params))) params$rmax= signif(max(1.1*data[[params$radius]]),3)
  params
}

params=paramsTidy(params)

This is the function that generates the rChart:

require(rCharts)

#We can probably tidy the way that the parameters are mapped...
#I wasn't sure whether to try to maintain the separation between params and rChart$params?
rChart.generator=function(params, h=400,w=800){
  rChart <- rCharts$new()
  rChart$setLib('../motionchart')
  rChart$setTemplate(script = "../motionchart/layouts/motionchart_Demo.html")

  rChart$set(
  
   countryHighlights='',
   yearMin= params$start,
   yearMax=params$end,
  
   x=params$x,
   y=params$y,
   radius=params$radius,
   color=params$color,
   key=params$key,
  
   ymin=params$ymin,
   ymax=params$ymax,
   xmin=params$xmin,
   xmax=params$xmax,
   rmin=params$rmin,
   rmax=params$rmax,
  
   xlabel=params$x,
   ylabel=params$y,
  
   yscale=params$yscale,
   xscale=params$xscale,
   
   width=w,
   height=h
 )

 rChart$set( data= rjson::toJSON(setNames(dat2, NULL)) )

 rChart
}

rChart.generator(params,w=1000,h=600)

Aside from tidying – and documenting/commenting – the code, the next thing on my to do list is to see whether I can bundle this up in a Shiny app. I made a start sketching a possible UI, but I’ve run out of time to do much more for a day or two… (I was also thinking of country checkboxes for either pulling in just that country data, or highlighting those countries.)

items=c("Fertility","GDP","Population")
names(items)=items

shinyUI(pageWithSidebar(
  headerPanel("Motion Chart demo"),
  
  sidebarPanel(
    selectInput(inputId = 'x',
                label = "X",
                choices = items,
                selected = 'Fertility'),
    selectInput(inputId = 'y',
                label = "Y",
                choices = items,
                selected = 'GDP'),
    selectInput(inputId = 'r',
                label = "Radius",
                choices = items,
                selected = 'Population')
  ),
  mainPanel(
    #The next line throws an error (a library is expected? But I don't want to use one?)
    showOutput("motionChart",'')
  )
))

As ever, we’ve quite possibly run out of time on getting much up on the OpenLearn website by Thursday to support the programme as it airs, which is partly why I’m putting this code out now… If you manage to do anything with it that would allow folk to dynamically explore a range of development indicators over the next day or two (especially GDP, fertility, mortality, average income, income distributions (this would require different visualisations?)), we may be able to give it a plug from OpenLearn, and maybe via any tweetalong campaign that’s running as the programme airs…

If you do come up with anything, please let me know via the comments, or twitter (@psychemedia)…


To leave a comment for the author, please follow the link and comment on their blog: OUseful.Info, the blog... » Rstats.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)