Visualize large data sets with the bigvis package

[This article was first published on Revolutions, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Creating visualizations of large data sets is a tough problem: with a limited number of pixels available on the screen (or just with the limited visual acuity of the human eye), massive numbers of symbols on the page can easily result in an uninterpretable mess. On Friday we shared one way of tackling the problem using Revolution R Enterprise: hexagonal binning charts. Relatedly, RStudio's chief scientist Hadley Wickham is taking a comprehensive approach to big-data visualization with his new open-source bigviz package, currently available on GitHub. (Disclaimer: early development of this package was funded by Revolution Analytics.)

The basic idea of the package is to use aggregation and smoothing techniques on big data sets before plotting, to create visualizations that give meaningful insights into the data, and that can be computed quickly and rendered efficiently using R's standard graphics engine. Despite the large data sets involved, the visualization functions in the pacakge are fast, because the “bin-summarise-smooth” cycle is performed directly in C++, direcly on the R object stored in memory. The system is described in detail in this Infovis preprint, and includes this example chart using the famous airline data:

Bigvis
The bigvis package is available for installation now on GitHub (use the devtools package to make the install easier). You can find installation instructions and links to the documentation in the README file linked below.

GitHub (Hadley Wickham): bigvis package

To leave a comment for the author, please follow the link and comment on their blog: Revolutions.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)