Major update of D3partitionR: Interactive viz’ of nested data with R and D3.js

[This article was first published on Enhance Data Science, and kindly contributed to R-bloggers].

D3partitionR is an R package for interactively visualizing nested and hierarchical data using D3.js and HTML widgets. For the last few weeks I’ve been working on a major D3partitionR update, which is now available on GitHub. Once enough feedback has been collected, the package will be uploaded to CRAN. Until then, you can install it using devtools:

[sourcecode language="r"]
library(devtools)
install_github("AntoineGuillot2/D3partitionR")
[/sourcecode]

Here is a quick overview of the possibilities using the Titanic data:

Example D3PartitionR

A major update

This is a major update: it will break code written for version 0.3.1.

New functionalities

  • Additional data for nodes: Extra data can be attached to specific nodes. For instance, if a comment or a link needs to be shown in the tooltip or label of some nodes, it can be added through the add_nodes_data function.

    Additional data for a node
    You can easily add a specific hyperlink or text to the tooltip
  • Variable selection and computation: You can now provide a variable for:
    • sizing: the variable that sets the size of each node.
    • color: any variable from your data.frame or from the nodes data can be used as a color.
    • label: any variable from your data.frame or from the nodes data can be used as a label.
    • tooltip: you can provide several variables to be displayed in the tooltip.
    • aggregation function: when numerical variables are provided, you can choose the function used to aggregate them.
  • Coloring: The color scale can now be continuous. For instance, you can color each node by its mean Titanic survival rate, which makes it easy to see at a glance that women in 1st class were more likely to survive than men in 3rd class.
Continuous color scale D3partitionR
Treemap showing the Titanic survival rate
  • Label: Labels showing the nodes’ names (or any other variable) can now be added to the plot.
  • Breadcrumb: To avoid overlapping, the width of each breadcrumb is now variable and depends on the length of its label.
Variable breadcrumb width
  • Legend: By default, the legend now shows all the modalities/levels that are in the plot. To keep the legend from wrapping, enable the zoom_subset option, which shows only the modalities of the direct children of the zoomed root.
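
To make the aggregation idea concrete, here is a minimal base-R sketch of the arithmetic behind a "mean survival rate per node". The counts are made up for illustration (they are not the real Titanic figures), and the names leaves and node_rate are hypothetical. It only shows how a count-weighted mean rolls leaf values up to a parent node; it is not the package's internal code.

[sourcecode language="r"]
## Toy leaf-level data: one row per (Sex, Survived) leaf with its passenger count.
## The numbers are invented for this example.
leaves <- data.frame(
  Sex      = c("female", "female", "male", "male"),
  Survived = c(1, 0, 1, 0),
  N        = c(30, 10, 5, 15)
)

## Survival rate of each 'Sex' node: mean of Survived over its leaves,
## weighted by the number of passengers in each leaf
node_rate <- sapply(
  split(leaves, leaves$Sex),
  function(d) weighted.mean(d$Survived, d$N)
)

print(node_rate)
[/sourcecode]

A continuous color scale can then map such a per-node value to a color, which is what makes differences in survival rate between groups visible at a glance.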

API and backend change

  • Easy data preprocessing: Data preparation was tedious in previous versions. Now you just need to aggregate your data.frame at the right level; it can then be passed directly to the D3partitionR functions, so you no longer have to nest the data.frame yourself, which can be pretty complicated.
[sourcecode language="r"]
## Loading packages
library(data.table)
library(D3partitionR)

## Reading data (the Titanic data set)
titanic_data <- fread("train.csv")

## Aggregating data to get one row per unique sequence of the 4 variables,
## with the number of passengers stored in column N
var_names <- c('Sex','Embarked','Pclass','Survived')
data_plot <- titanic_data[, .N, by = var_names]

## Prefixing each value with its variable name, e.g. 'Sex male'
data_plot[, (var_names) := lapply(var_names, function(x) paste0(x, ' ', data_plot[[x]]))]
[/sourcecode]
  • The R API is greatly improved: D3partitionR charts are now S3 objects, with clearly named functions to add data and to modify the chart’s appearance and parameters. With pipes, the D3partitionR syntax now looks gg-like.
[sourcecode language="r"]
## %>% comes from magrittr; load it if the pipe is not already available
library(magrittr)

## Treemap
D3partitionR() %>%
  add_data(data_plot, count = 'N', steps = c('Sex','Embarked','Pclass','Survived')) %>%
  set_chart_type('treemap') %>%
  plot()

## Circle treemap
D3partitionR() %>%
  add_data(data_plot, count = 'N', steps = c('Sex','Embarked','Pclass','Survived')) %>%
  set_chart_type('circle_treemap') %>%
  plot()
[/sourcecode]

  • Style consistency across chart types: It’s now easy to switch from a treemap to a circle treemap or a sunburst while keeping a consistent styling policy.
  • Update to D3.js v4 and modularization: Each type of chart now has its own file and function. This function draws the chart at its root level, with labels and colors, and returns a zoom function. The on-click actions (such as the breadcrumb and legend updates) and the hover actions (tooltips) are defined in a ‘global’ function. Hence, adding new visualizations will be easy: only the drawing and zooming script needs to be adapted to this template.
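
For readers without train.csv at hand, the aggregation step from the data preprocessing item above can be tried on a few made-up rows. The following is a minimal base-R sketch, not the package's own code: the names passengers and labelled and the toy values are assumptions for illustration, and only three of the four variables are used to keep it short.

[sourcecode language="r"]
## Toy passenger-level data standing in for train.csv (values are invented)
passengers <- data.frame(
  Sex      = c("male", "female", "female", "male", "female"),
  Pclass   = c(3, 1, 1, 3, 3),
  Survived = c(0, 1, 1, 0, 1)
)
var_names <- c("Sex", "Pclass", "Survived")

## Prefix each value with its variable name, e.g. "Sex male"
labelled <- passengers
labelled[var_names] <- lapply(var_names, function(x) paste(x, passengers[[x]]))

## One row per unique sequence of the variables, with its count in column N
labelled$N <- 1
data_plot <- aggregate(N ~ Sex + Pclass + Survived, data = labelled, FUN = sum)

print(data_plot)
[/sourcecode]

The result has the same shape as the data_plot built with data.table earlier: one row per unique sequence, a count column N, and values that carry their variable name so that nodes from different levels cannot be confused.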

What’s next

Thanks to the feedback that will be collected over the next week, a stable release should soon be on CRAN. I will also post more resources on D3partitionR, with use cases and examples of Shiny applications built on it.
