
I love R.  It is a great language and platform for statistical work and graphing.  But every technology has its limits, and other tools can meet different needs.  So in this post, I will start with R and move on to the JavaScript InfoVis Toolkit.  I must admit that I can't claim to know the limits of R: I am regularly corrected by the friendly and informed R community, which makes this blog better and helps me as well.

I have been regularly considering the best ways to visualize trees – especially large ones.  Having a background in Oracle, I often revisit the employee hierarchy that is included as part of the HR demonstration schema.

The query to construct a result set that represents the employee hierarchy relies upon Oracle’s non-standard hierarchical query syntax.  Employees are associated with other records in the employee table by a manager id.  The root node of the tree is the record with a null manager id (the top manager who has no superior within the organization).

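library(RODBC)
# Connect to the Oracle XE instance through an ODBC data source named "XE"
ch <- odbcConnect("XE", uid = "HR", pwd = "HR")

# START WITH anchors the hierarchy at the root (the record with a null manager id);
# the WHERE clause then drops the root's own row, which has no manager to pair with.
sql <- "SELECT
replace(m.first_name,' ','_')||'_'||
replace(m.last_name,' ','_') manager_name,
replace(e.first_name,' ','_')||'_'||
replace(e.last_name,' ','_') employee_name
FROM employees e
LEFT OUTER JOIN employees m
ON m.employee_id = e.manager_id
WHERE e.manager_id is not null
START WITH e.manager_id is null
CONNECT BY PRIOR e.employee_id = e.manager_id
ORDER siblings BY e.last_name,e.first_name"

# Run the query and release the connection
r <- sqlQuery(ch, paste(sql, collapse = ' '))
close(ch)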

The data returned is a simple listing that pairs up each employee with his/her manager.

      MANAGER_NAME           EMPLOYEE_NAME
1     Steven_King            Gerald_Cambrault
2     Gerald_Cambrault       Elizabeth_Bates
3     Gerald_Cambrault       Harrison_Bloom
4     Gerald_Cambrault       Tayler_Fox
5     Gerald_Cambrault       Sundita_Kumar
6     Gerald_Cambrault       Lisa_Ozer
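
If you want to follow along without an Oracle instance, the rows shown above can be typed in by hand. This is only a sketch: it assumes just those six rows (the real result set is larger), and the names root and reports are mine, not part of the original code.

```r
# Hand-built stand-in for the query result; only the six rows shown above.
r <- data.frame(
  MANAGER_NAME  = c("Steven_King", rep("Gerald_Cambrault", 5)),
  EMPLOYEE_NAME = c("Gerald_Cambrault", "Elizabeth_Bates", "Harrison_Bloom",
                    "Tayler_Fox", "Sundita_Kumar", "Lisa_Ozer"),
  stringsAsFactors = FALSE
)

# The root of the tree is the manager who never appears as anyone's employee.
root <- setdiff(r$MANAGER_NAME, r$EMPLOYEE_NAME)

# Number of direct reports per manager.
reports <- table(r$MANAGER_NAME)

print(root)
print(reports)
```

Because each row is one manager-to-employee edge, this two-column listing is all that a graph library needs to rebuild the hierarchy.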

The igraph library can then be used to plot the data.

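library(igraph)
# Each row of r is one manager -> employee edge
# (graph_from_data_frame is the current name for graph.data.frame)
g <- graph_from_data_frame(r, directed = TRUE)
# Label every vertex with the employee name
V(g)$label <- V(g)$name
# Open the interactive Tk plotting window
tkplot(g)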

The results are a bit cluttered due to the number of nodes – but can be manually manipulated when viewed using tkplot.  The typical organizational chart structure can be obtained using the Reingold-Tilford layout.
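
For reference, here is a rough sketch of how the Reingold-Tilford layout can be requested in igraph. It assumes a recent igraph version, where the layout is exposed as layout_as_tree(), and it uses a small hand-typed edge list (the edges data frame is my stand-in, not the query result) so that it runs without a database.

```r
library(igraph)

# Assumed sample edges (manager -> employee); the real data comes from the query.
edges <- data.frame(
  from = c("Steven_King", rep("Gerald_Cambrault", 5)),
  to   = c("Gerald_Cambrault", "Elizabeth_Bates", "Harrison_Bloom",
           "Tayler_Fox", "Sundita_Kumar", "Lisa_Ozer"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = TRUE)

# Root the tree at the vertex with no incoming edge (nobody manages them).
root <- which(degree(g, mode = "in") == 0)

# layout_as_tree() computes Reingold-Tilford coordinates, one row per vertex.
lay <- layout_as_tree(g, root = root)

plot(g, layout = lay, vertex.label = V(g)$name,
     vertex.size = 8, edge.arrow.size = 0.4)
```

The same layout matrix can also be handed to tkplot() through its layout argument if you prefer the interactive window.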

Other possibilities include the circle layout…

…the Fruchterman-Reingold layout…
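
The two layouts just mentioned can be sketched in the same way. This assumes a recent igraph version (layout_in_circle() and layout_with_fr()) and, again, a small hand-typed edge list rather than the full query result; the seed value is arbitrary.

```r
library(igraph)

# Assumed sample edges (manager -> employee), standing in for the query result.
edges <- data.frame(
  from = c("Steven_King", rep("Gerald_Cambrault", 5)),
  to   = c("Gerald_Cambrault", "Elizabeth_Bates", "Harrison_Bloom",
           "Tayler_Fox", "Sundita_Kumar", "Lisa_Ozer"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = TRUE)

# Circle layout: all vertices placed around a circle.
circle <- layout_in_circle(g)

# Fruchterman-Reingold is force-directed and starts from random positions,
# so fix the seed to get a repeatable picture.
set.seed(42)
fr <- layout_with_fr(g)

# Draw both side by side.
par(mfrow = c(1, 2))
plot(g, layout = circle, main = "Circle", vertex.size = 8, edge.arrow.size = 0.4)
plot(g, layout = fr, main = "Fruchterman-Reingold", vertex.size = 8, edge.arrow.size = 0.4)
```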

A better technique for analyzing networks of this size in a small space is animation.  The following shows a couple of the animated charts available through the JavaScript InfoVis Toolkit.  These were produced using Ruby and Sinatra to return a JSON object that is rendered using JavaScript.  The following is a hyperbolic tree with the root node in the center.
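
To give an idea of the JSON involved, here is a minimal sketch in R rather than the Ruby and Sinatra code that was actually used. It assumes the nested id/name/data/children shape that the toolkit's tree examples consume, and it works from a hand-typed edge list. The function to_json_tree is a hypothetical helper of mine, and it does no JSON escaping, which is only safe because the names here are plain letters and underscores.

```r
# Assumed sample edges (manager -> employee), standing in for the query result.
edges <- data.frame(
  manager  = c("Steven_King", rep("Gerald_Cambrault", 5)),
  employee = c("Gerald_Cambrault", "Elizabeth_Bates", "Harrison_Bloom",
               "Tayler_Fox", "Sundita_Kumar", "Lisa_Ozer"),
  stringsAsFactors = FALSE
)

# Recursively serialize one node and its subtree as a JSON string.
# The underscore-joined name doubles as the node id (assumed unique);
# the display name swaps underscores back to spaces.
to_json_tree <- function(name, edges) {
  kids <- edges$employee[edges$manager == name]
  children <- vapply(kids, to_json_tree, character(1), edges = edges)
  sprintf('{"id":"%s","name":"%s","data":{},"children":[%s]}',
          name, gsub("_", " ", name), paste(children, collapse = ","))
}

# Start from the manager who is nobody's employee.
root <- setdiff(edges$manager, edges$employee)
json <- to_json_tree(root, edges)
cat(json, "\n")
```

A web service then only has to return this string, and the JavaScript side can pass the parsed object to the visualization.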

When a different node is clicked, it is shifted to the center and the remaining nodes are arranged around it.  In this example Gerald Cambrault is selected and the graph smoothly transitions to the image that follows.

If you prefer a tree that is more like the traditional organizational chart, the space tree might be used instead.
This visualization also adjusts when a node is selected.

It looks nice and the animation is elegant and useful (good job Nicolas). The code is available on GitHub at this location.  It is not the best implementation, so feel free to fork it and provide a better solution.