Sankey diagrams are great for visualising flows from one set of data values to another. Although they are named after the Irish Captain Matthew Henry Phineas Riall Sankey, who used this type of diagram in 1898 to show the energy efficiency of a steam engine, the best known Sankey diagram is probably Charles Minard's Map of Napoleon's Russian Campaign of 1812, which he actually produced in 1869.
[Figure from Thomas Rahlf: Datendesign mit R]
The above example from Thomas Rahlf's book Datendesign mit R shows that Minard's plot can be reproduced with base graphics in R. In 2010 Aaron Berdanier posted the SankeyR function, and Erik Andrulis published the riverplot package on CRAN, which also lets users create static Sankey charts.
Interactive Sankey diagrams can be generated with rCharts and now also with googleVis (version >= 0.5.0). For my first example I use UK visitor data from VisitBritain.org. The following diagram visualises the flow of visitors in 2012: where they came from and which parts of the UK they visited. This example already illustrates the key concept: I need a data frame with three columns that describe the flow from a source to a target and the strength, or weight, of the connection.
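As a minimal sketch of that three-column layout, the snippet below builds a small data frame and passes it to `gvisSankey()`. The column names (`origin`, `destination`, `visits`) and all numbers are made up for illustration and are not the VisitBritain figures; the chart call is guarded so the data preparation still runs where googleVis is not installed.

```r
# Hypothetical flows: the values are invented for illustration only,
# they are NOT the VisitBritain 2012 figures.
flows <- data.frame(
  origin      = c("France", "France", "Germany", "Germany", "USA", "USA"),
  destination = c("London", "Scotland", "London", "Wales", "London", "Scotland"),
  visits      = c(5, 2, 4, 1, 6, 3),
  stringsAsFactors = FALSE
)

# gvisSankey() needs to know which column is the source, which is the
# target and which holds the weight of each connection.
if (requireNamespace("googleVis", quietly = TRUE)) {
  sk <- googleVis::gvisSankey(flows,
                              from   = "origin",
                              to     = "destination",
                              weight = "visits")
  # plot() on a gvis object opens the chart in the browser,
  # so only do this in an interactive session.
  if (interactive()) plot(sk)
}
```

Each row is one link in the diagram, so the width of a band is driven by the value in the weight column.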
My next example uses a graph data set that I visualise in the same way again, but here I start to play around with the various parameters of the Google API.
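The following is a sketch of how such a graph example might look, not necessarily the code behind the chart. The session info lists igraph, but to stay self-contained this version builds a small tree-shaped edge list in base R instead; the tree size, the weights and the colour and font settings are arbitrary assumptions. The `sankey` option is passed as a JSON-style string, which googleVis hands on to the Google Charts API.

```r
# Tree-shaped edge list in base R: node i (i >= 2) hangs off
# parent (i - 2) %/% k + 1, so every parent gets up to k children.
n <- 13   # number of nodes (assumed, small enough to label with LETTERS)
k <- 3    # children per parent (assumed)
child  <- 2:n
parent <- (child - 2) %/% k + 1

set.seed(123)
edges <- data.frame(
  source = LETTERS[parent],
  target = LETTERS[child],
  value  = rpois(n - 1, 4) + 1,   # arbitrary positive weights
  stringsAsFactors = FALSE
)

if (requireNamespace("googleVis", quietly = TRUE)) {
  sk <- googleVis::gvisSankey(
    edges, from = "source", to = "target", weight = "value",
    options = list(
      # Styling for links and nodes; the values are illustrative only.
      sankey = "{link: { color: { fill: '#d799ae' } },
                 node: { width: 4,
                         color: { fill: '#a61d4c' },
                         label: { fontSize: 14,
                                  color: '#871b47',
                                  bold: true } }}"
    )
  )
  if (interactive()) plot(sk)
}
```

Because every edge points from a parent to a later node, the edge list contains no cycles, which the Google Sankey chart requires.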
As stated by Google, the Sankey chart may be undergoing substantial revisions in future Google Charts releases.
For more information and installation instructions see the googleVis project site and Google documentation.
R version 3.0.3 (2014-03-06)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
stats graphics grDevices utils datasets methods base

other attached packages:
googleVis_0.5.0-4 igraph_0.7.0

loaded via a namespace (and not attached):
RJSONIO_1.0-3 tools_3.0.3