Hierarchical Visualizations in R and the JavaScript InfoVis Toolkit

July 12, 2010

(This article was first published on R-Chart, and kindly contributed to R-bloggers)

I love R.  It is a great language and platform for statistical work and graphing.  But every technology has its limits, and other tools can meet different needs.  So in this post, I will start with R and move on to the JavaScript InfoVis Toolkit.  I must admit that I can’t say I know the limits of R.  I am regularly corrected by the friendly and informed R community, which makes this blog better and helps me as well.

I have been regularly considering the best ways to visualize trees – especially large ones.  Having a background in Oracle, I often revisit the employee hierarchy that is included as part of the HR demonstration schema.

The query to construct a result set that represents the employee hierarchy relies upon Oracle’s non-standard hierarchical query syntax.  Employees are associated with other records in the employee table by a manager id.  The root node of the tree is the record with a null manager id (the top manager who has no superior within the organization).

library(RODBC)

# Connect to the local Oracle XE instance as the HR demo user
ch = odbcConnect("XE", uid="HR", pwd="HR")

# Pair each employee with his or her manager, walking the tree from the root
sql = "SELECT
      replace(m.first_name,' ','_')||'_'||
      replace(m.last_name,' ','_') manager_name,
      replace(e.first_name,' ','_')||'_'||
      replace(e.last_name,' ','_') employee_name
   FROM employees e
   LEFT OUTER JOIN employees m
   ON m.employee_id = e.manager_id
   WHERE e.manager_id IS NOT NULL
   START WITH e.manager_id IS NULL
   CONNECT BY PRIOR e.employee_id = e.manager_id
   ORDER SIBLINGS BY e.last_name, e.first_name"

r = sqlQuery(ch, paste(sql, collapse=' '))

The data returned is a simple listing that pairs up each employee with his/her manager.

      MANAGER_NAME           EMPLOYEE_NAME
1     Steven_King            Gerald_Cambrault
2     Gerald_Cambrault       Elizabeth_Bates
3     Gerald_Cambrault       Harrison_Bloom
4     Gerald_Cambrault       Tayler_Fox
5     Gerald_Cambrault       Sundita_Kumar
6     Gerald_Cambrault       Lisa_Ozer
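
For readers without an Oracle instance, the same manager/employee pairing can be sketched in base R as a self-join on the manager id. This is only an illustration: the tiny `employees` data frame, its id values, and the `full_name` helper are hypothetical and are not taken from the HR schema.

```r
# Hypothetical miniature of an employees table (ids are made up for illustration).
employees <- data.frame(
  employee_id = c(1, 2, 3, 4),
  first_name  = c("Steven", "Gerald", "Elizabeth", "Harrison"),
  last_name   = c("King", "Cambrault", "Bates", "Bloom"),
  manager_id  = c(NA, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join first and last name, replacing spaces with underscores,
# as the SQL replace(...) calls do.
full_name <- function(first, last) {
  gsub(" ", "_", paste(first, last, sep = "_"), fixed = TRUE)
}
employees$name <- full_name(employees$first_name, employees$last_name)

# Self-join: find the row of each employee's manager via manager_id.
idx <- match(employees$manager_id, employees$employee_id)
pairs <- data.frame(
  manager_name  = employees$name[idx],
  employee_name = employees$name,
  stringsAsFactors = FALSE
)

# Drop the root, who has no manager (mirrors WHERE e.manager_id IS NOT NULL).
pairs <- pairs[!is.na(employees$manager_id), ]
print(pairs)
```

Unlike the Oracle query, this sketch does not order siblings or walk the tree; it only produces the edge list that igraph needs.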

The igraph library can then be used to plot the data.

library(igraph)

g = graph.data.frame(r, directed = TRUE)
V(g)$label = V(g)$name

The results are a bit cluttered due to the number of nodes, but the nodes can be rearranged by hand when the graph is viewed using tkplot.  The typical organizational chart structure can be obtained using the Reingold-Tilford layout.

Other possibilities include the circle layout…

…the Fruchterman-Reingold layout…

…and the Kamada-Kawai layout.
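
The four layouts named above can be compared side by side with a sketch along these lines. It assumes a recent igraph release, where the layout functions are called `layout_as_tree` (Reingold-Tilford), `layout_in_circle`, `layout_with_fr` and `layout_with_kk`; older releases used different names. The `edges` data frame is a toy stand-in built from the six sample rows listed earlier, not the full query result.

```r
library(igraph)

# Toy edge list from the sample rows shown earlier (manager -> employee).
edges <- data.frame(
  manager_name  = c("Steven_King", rep("Gerald_Cambrault", 5)),
  employee_name = c("Gerald_Cambrault", "Elizabeth_Bates", "Harrison_Bloom",
                    "Tayler_Fox", "Sundita_Kumar", "Lisa_Ozer"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = TRUE)

# Locate the root of the hierarchy by name.
root_id <- which(V(g)$name == "Steven_King")

# Each layout function returns a matrix with one (x, y) row per vertex.
layouts <- list(
  "Reingold-Tilford"    = layout_as_tree(g, root = root_id),
  "Circle"              = layout_in_circle(g),
  "Fruchterman-Reingold" = layout_with_fr(g),
  "Kamada-Kawai"        = layout_with_kk(g)
)

# Draw the four layouts in a 2 x 2 grid.
op <- par(mfrow = c(2, 2), mar = c(1, 1, 2, 1))
for (name in names(layouts)) {
  plot(g, layout = layouts[[name]], main = name,
       vertex.size = 8, vertex.label.cex = 0.7, edge.arrow.size = 0.4)
}
par(op)
```

Passing the same layout matrix to `tkplot(g, layout = ...)` instead of `plot` should give the interactive view mentioned above.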

A better technique for analyzing networks of this size in a small space is to use animation.  The following shows a couple of the animated charts available through the JavaScript InfoVis Toolkit.  These were produced using Ruby and Sinatra to return a JSON object that is rendered using JavaScript.  The following is a hyperbolic tree with the root node in the center.

When a different node is clicked, it is shifted to the center and the remaining nodes are arranged around it.  In this example Gerald Cambrault is selected and the graph smoothly transitions to the image that follows.

If you prefer a tree that is more like the traditional organizational chart, the space tree might be used instead.

This visualization also adjusts when a node is selected.

It looks nice and the animation is elegant and useful (good job Nicolas).  The code is available on GitHub at this location.  It is not the best implementation, so feel free to fork it and provide a better solution.
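
To give a feel for the JSON involved, here is a rough base-R sketch that turns a manager/employee edge list into nested JSON. It is not the Ruby/Sinatra code mentioned above. It assumes the toolkit's tree examples expect each node to carry `id`, `name`, `data` and `children` fields; check the toolkit documentation before relying on that shape. The function name `to_jit_json` is hypothetical, and the sketch does no JSON escaping, so it assumes names contain no quotes or backslashes.

```r
# Toy edge list from the sample rows shown earlier (manager -> employee).
edges <- data.frame(
  manager_name  = c("Steven_King", rep("Gerald_Cambrault", 5)),
  employee_name = c("Gerald_Cambrault", "Elizabeth_Bates", "Harrison_Bloom",
                    "Tayler_Fox", "Sundita_Kumar", "Lisa_Ozer"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: recursively serialize a node and its reports.
# Assumes the edge list is a tree (no cycles) and names need no escaping.
to_jit_json <- function(node, edges) {
  kids <- edges$employee_name[edges$manager_name == node]
  children <- vapply(kids, to_jit_json, character(1), edges = edges)
  sprintf('{"id":"%s","name":"%s","data":{},"children":[%s]}',
          node,                                  # underscore form as a unique id
          gsub("_", " ", node, fixed = TRUE),    # display name with spaces
          paste(children, collapse = ","))
}

json <- to_jit_json("Steven_King", edges)
cat(json, "\n")
```

A web service would return this string as the response body, and the page's JavaScript would hand the parsed object to the visualization.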
