Introducing trelliscopejs
I’m really excited to announce the beta release of a visualization project I’ve put a lot of work into for the past several months, trelliscopejs.
trelliscopejs is an R package that brings faceted visualizations to life while plugging in to common analytical workflows like ggplot2 or the “tidyverse”. To quickly get a feel for it, take a look at this screen capture:
I highly recommend reading the full documentation to get a complete picture of what the package does, but to keep this post concise, here are a couple of examples.
First let’s install some packages that we’ll use in the examples:
devtools::install_github("hafen/trelliscopejs")
install.packages(c("gapminder", "housingData", "rbokeh"))
Gapminder with ggplot2
The first example is for ggplot2 users. You can swap facet_wrap() for facet_trelliscope() and write code like this:
library(ggplot2)  # provides qplot(), xlim(), ylim(), and theme_bw()
library(trelliscopejs)
library(gapminder)

qplot(year, lifeExp, data = gapminder) +
  xlim(1948, 2011) + ylim(10, 95) + theme_bw() +
  facet_trelliscope(~ country + continent, nrow = 2, ncol = 7, width = 300)
To create a visualization like this:
If this display doesn’t appear correctly for you (because of blog aggregators, etc.), you can follow this link to the display in a dedicated window.
This is a display of life expectancy across time for countries in the gapminder dataset. Instead of a static faceted plot, we get an interactive display. We can interact with the panels by sorting and filtering on metrics that were computed about each subset of our data being plotted, and we can paginate through panels when they don’t all fit on one page.
Go ahead and experiment with the interactive controls in the plot above. You can click the fullscreen button in the bottom right if you want more space. The “question mark” icon in the upper right corner will give you more information about how to use the viewer.
While interacting with the display, do you see anything interesting? The data being plotted is fairly simple – we see a usually steady increase in life expectancy over time, with varying mean by country. However, the eye is able to quickly catch deviations from the normal pattern such as the dips in Rwanda and Cambodia or the peaks in life expectancy in several African countries in the late 80s / early 90s with life expectancy decreasing after. The plots of the raw data have a story to tell that you might miss if you just computed summaries.
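To make the idea of sorting panels by a per-subset metric concrete, here is a minimal base R sketch. It is not how trelliscopejs computes its metrics internally. The toy data frame, its values, and the names panel_metrics and min_change are all hypothetical stand-ins for the gapminder example.

```r
# Toy stand-in for gapminder: hypothetical values for three "countries".
toy <- data.frame(
  country = rep(c("A", "B", "C"), each = 4),
  year    = rep(c(1980, 1985, 1990, 1995), times = 3),
  lifeExp = c(60, 62, 64, 66,   # A: steady rise
              50, 52, 40, 45,   # B: sharp dip in 1990
              70, 71, 72, 73)   # C: steady rise
)

# Compute one row of metrics per panel (here, per country).
panel_metrics <- do.call(rbind, lapply(split(toy, toy$country), function(d) {
  d <- d[order(d$year), ]
  data.frame(
    country      = d$country[1],
    mean_lifeExp = mean(d$lifeExp),
    # smallest change between consecutive observations;
    # a large negative value flags a dip
    min_change   = min(diff(d$lifeExp))
  )
}))

# Sorting on the metric brings the unusual panel to the front.
panel_metrics[order(panel_metrics$min_change), ]
```

In the interactive display the same kind of ordering is done through the sort and filter controls rather than in code.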
Housing data with dplyr and rbokeh
The housingData package has a dataset, “housing”, that gives the monthly median list and sold prices for residential homes by US county, provided by Zillow.
Let’s take a look at the median list price over time by county, but this time illustrating the use of the trelliscope() function with dplyr. We will create the plot using rbokeh although you can use any plotting library you’d like to.
The trelliscope() function is meant to be used in “tidyverse” pipelines, with the idea that a faceted display can be described by a data frame of summaries computed on groups, with one of the summaries being a plot object.
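The “data frame with a plot column” idea can be sketched without trelliscopejs at all. The snippet below assumes dplyr and ggplot2 are installed and uses the built-in mtcars data, which is not part of this post’s examples. A plain list() stands in for the package’s panel() wrapper, so treat it as an illustration of the data structure only.

```r
library(dplyr)
library(ggplot2)

# One row per group: scalar summaries plus a list column holding a plot.
d_sketch <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    mean_mpg = mean(mpg),
    n_cars   = n(),
    # wrap the plot in list() so each group contributes one list element
    panel    = list(
      ggplot(data.frame(wt = wt, mpg = mpg), aes(wt, mpg)) + geom_point()
    )
  )

d_sketch
```

Each row now describes one facet: the scalar columns are what a viewer could sort and filter on, and the list column holds what it would draw.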
In the example below, we group the housing data by county and state and then compute some summaries (the slope of the list price vs. time, the mean list price, the mean sold price, and the number of non-NA observations). We also compute a “summary” plot of the median list price vs. time.
library(rbokeh)
library(dplyr)
library(housingData)

lm_coefs <- function(x, y) coef(lm(y ~ x))

d <- housing %>%
  group_by(county, state) %>%
  summarise(
    slope = lm_coefs(time, medListPriceSqft)[2],
    mean_list = mean(medListPriceSqft, na.rm = TRUE),
    mean_sold = mean(medSoldPriceSqft, na.rm = TRUE),
    n_obs = length(which(!is.na(medListPriceSqft))),
    zillow_link = cog_href(
      sprintf("http://www.zillow.com/homes/%s_rb/",
        gsub(" ", "-", paste(county, state)))[1]),
    panel = panel(
      figure(xlab = "time", ylab = "median list / sq ft", toolbar = NULL) %>%
        ly_points(time, medListPriceSqft,
          hover = data_frame(time = time, mean_list = medListPriceSqft)))
  ) %>%
  filter(n_obs > 1)
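If the individual summaries are hard to picture, the sketch below runs the same base R expressions on a single made-up group. The time and price values are hypothetical, and cog_href() is left out because it is specific to trelliscopejs, so only the link string itself is built.

```r
lm_coefs <- function(x, y) coef(lm(y ~ x))

# Hypothetical values for one county, with one missing month.
time  <- 1:6
price <- c(100, 102, NA, 106, 108, 110)

# coef() returns c(intercept, slope); element 2 is the slope.
# lm() drops the NA row under R's default na.action.
slope <- lm_coefs(time, price)[2]

# Number of non-NA observations, as in the n_obs summary.
n_obs <- length(which(!is.na(price)))

# Spaces become hyphens to form the URL slug. In the grouped code,
# [1] keeps a single link per group because paste() is vectorized.
link <- sprintf("http://www.zillow.com/homes/%s_rb/",
                gsub(" ", "-", paste("Ada County", "ID")))

slope
n_obs
link
```

Because the non-missing points lie exactly on a line that rises by 2 per time step, the slope comes out as 2, which makes it easy to check that the helper picks the right coefficient.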
To emphasize that the result of this is simply a data frame:
Source: local data frame [2,975 x 8]
Groups: county [1,772]

             county  state mean_list mean_sold n_obs        panel
             <fctr> <fctr>     <dbl>     <dbl> <int>       <list>
1  Abbeville County     SC  72.76035  61.69598    77 <S3: rbokeh>
2     Acadia Parish     LA  67.18250  73.64299    77 <S3: rbokeh>
3   Accomack County     VA 123.22507  54.03628    81 <S3: rbokeh>
4        Ada County     ID 104.24764       NaN    81 <S3: rbokeh>
5      Adair County     IA  64.20355       NaN    76 <S3: rbokeh>
6      Adair County     KY  70.14871  50.51296    77 <S3: rbokeh>
7      Adair County     MO  68.75936 281.20160    81 <S3: rbokeh>
8      Adair County     OK  67.41909       NaN    81 <S3: rbokeh>
9      Adams County     CO 121.83557 130.31120    81 <S3: rbokeh>
10     Adams County     IA 124.56285       NaN    76 <S3: rbokeh>
# ... with 2,965 more rows, and 2 more variables: slope <dbl>,
#   zillow_link <chr>
Note that one of our columns is an rbokeh object. We can now pipe this data frame into trelliscope(), which will create a display for us. The summaries we computed will be made available as metrics we can use to interact with the panels in the display, and the panels will be created based on our panel column.
d %>% trelliscope(name = "list_vs_time")
Here is the resulting display:
If this display doesn’t appear correctly, please visit this link.
There are a lot of fun things you can explore with this display. Which counties are the cheapest to live in? Which were not affected by the housing crisis? What is happening in your county or state?
Easy to embed and share
I haven’t mentioned yet that trelliscopejs is an htmlwidget that produces pure HTML / JavaScript applications. This means you can easily embed your displays in R Markdown notebooks or documents, share the generated HTML files with others, or post them on the web through a simple web server or GitHub Pages. For example, the displays you saw in this post are hosted on GitHub Pages.
Why is this useful?
Trelliscope is based on the idea of “small multiples”, a simple but serious visualization technique. For more on the virtues of “small multiples”, please read here.
What’s next
This post provides two examples of interfaces for creating Trelliscope displays, but it would not be difficult to support other workflows as well. We are working on making sure we provide the most convenient interfaces to the most common workflows, so there will be some iteration on getting that right.
If you want to see some other things we plan on making happen for trelliscopejs, see here.
Give it a try!
I hope you will find some interesting use cases and give trelliscopejs a try on your data. Again, please read the full documentation for more on getting going with the package. Although I stated that it is a beta release, I’ve waited to announce it until things have become quite stable. Don’t be surprised if there are minor tweaks in the future, but also don’t be afraid to give it a try!