Comparing plotly & ggplotly plot generation times

Posted on November 26, 2017 by R on The Jumping Rivers Blog in R bloggers | 0 Comments

[This article was first published on R on The Jumping Rivers Blog, and kindly contributed to R-bloggers].

The plotly package. A godsend for interactive documents, dashboards and presentations. For such documents, most people would prefer a plot created in plotly to one created in ggplot2. Why? Using plotly gives you neat and, crucially, interactive options at the top of the plot, whereas ggplot2 objects are static. In an app we have been developing here at Jumping Rivers, we found ourselves asking: would it be quicker to use plot_ly() or to wrap a ggplot2 object in ggplotly()? I found the results staggering.

Prerequisites

Throughout we will be using the packages dplyr, tidyr, ggplot2, plotly and microbenchmark. The data in use is the Birthdays dataset in the mosaicData package. This data set contains the daily birth count in each state of the USA from 1969 to 1988. The packages can be installed in the usual way (remember you can install packages in parallel):

install.packages(c("mosaicData", "dplyr", 
                   "tidyr", "ggplot2", 
                   "plotly", "microbenchmark"))
library("mosaicData")
library("dplyr")
library("tidyr")
library("ggplot2")
library("plotly")
library("microbenchmark")
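On installing in parallel: a minimal sketch, assuming this refers to the Ncpus setting that install.packages() consults when it builds more than one source package. Leaving one core free is an arbitrary choice made here for illustration.

```r
# Assumption: "in parallel" means the Ncpus setting used by install.packages()
# when it builds more than one source package.
n_cores = parallel::detectCores()
if (is.na(n_cores)) n_cores = 1L   # detectCores() can return NA

# Leave one core free (an arbitrary choice), but never go below 1
options(Ncpus = max(1L, n_cores - 1L))
getOption("Ncpus")

# Later calls such as install.packages(c("dplyr", "tidyr")) pick this up
# through the default Ncpus = getOption("Ncpus", 1L).
```

Setting the option once, for example in an .Rprofile, saves passing Ncpus to every install.packages() call.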

Analysis

Let’s load and take a look at the data.

data("Birthdays", package = "mosaicData")
head(Birthdays)
##   state year month day       date wday births
## 1    AK 1969     1   1 1969-01-01  Wed     14
## 2    AL 1969     1   1 1969-01-01  Wed    174
## 3    AR 1969     1   1 1969-01-01  Wed     78
## 4    AZ 1969     1   1 1969-01-01  Wed     84
## 5    CA 1969     1   1 1969-01-01  Wed    824
## 6    CO 1969     1   1 1969-01-01  Wed    100

First, we’ll create a very simple scatter graph of the mean births in every year.

meanb = Birthdays %>% 
    group_by(year) %>% 
    summarise(mean = mean(births))

Wrapping this as a ggplot object inside ggplotly() we obtain this…

ggplotly(ggplot(meanb) + 
  geom_point(aes(y = mean, x = year, colour = year)))

Whilst using plot_ly() gives us this…

plot_ly(data = meanb, 
        y = ~mean, x = ~year, color = ~year, 
        type = "scatter")

Both graphs are identical, bar styling, yes?
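One way to check that beyond eyeballing the plots is to build both objects and look at the traces that would be sent to plotly.js. The sketch below uses a small made-up data frame (toy is hypothetical, standing in for meanb) and plotly_build(). It also sets mode = "markers" explicitly, since plot_ly() otherwise chooses the mode itself and says so in a message.

```r
library("ggplot2")
library("plotly")

# Hypothetical stand-in for meanb: one row per year
toy = data.frame(year = 1969:1988, mean = seq(170, 189))

p_gg = ggplotly(ggplot(toy) +
  geom_point(aes(y = mean, x = year, colour = year)))
p_pl = plot_ly(data = toy,
               y = ~mean, x = ~year, color = ~year,
               type = "scatter", mode = "markers")

# plotly_build() evaluates each object into the list of traces for plotly.js
built_gg = plotly_build(p_gg)
built_pl = plotly_build(p_pl)

# Inspect the trace types and modes of each version
sapply(built_gg$x$data, function(tr) tr$type)
sapply(built_pl$x$data, function(tr) tr$mode)
```

The exact number of traces may differ between the two (a colour bar can be carried as its own trace), so it is the trace contents rather than the counts that are worth comparing.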

Now let’s use microbenchmark to see how their timings compare (for an overview on timing R functions, see our previous blog post).

time = microbenchmark::microbenchmark(
        ggplotly = ggplotly(ggplot(meanb) + 
                            geom_point(aes(y = mean, x = year, colour = year))),
        plotly = plot_ly(data = meanb, 
                         y = ~mean, x = ~year, 
                         color = ~year, type = "scatter"),
        times = 100, unit = "s")
time
## Unit: seconds
##      expr      min       lq     mean   median       uq     max neval cld
##  ggplotly 0.050139 0.052229 0.070750 0.054760 0.056785 1.56652   100   b
##    plotly 0.002475 0.002527 0.003017 0.002571 0.002674 0.03061   100  a
autoplot(time)
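To put a number on the gap, the slowdown factors can be worked out from the summary table. A small sketch, with the values copied by hand from the output above rather than pulled out of the time object:

```r
# Timings in seconds, copied from the microbenchmark output above
ggplotly_s = c(min = 0.050139, mean = 0.070750, median = 0.054760)
plotly_s   = c(min = 0.002475, mean = 0.003017, median = 0.002571)

# How many times slower ggplotly() is on each statistic
slowdown = ggplotly_s / plotly_s
round(slowdown, 1)
```

On these numbers the ratio of means is about 23.5 and the ratio of medians about 21.3. Both means are pulled up by the occasional slow run (see the max column).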

Now I thought nesting a ggplot object within ggplotly() would be slower than using plot_ly(), but I didn’t think it would be this slow. On average ggplotly() is approximately 23 times slower than plot_ly(). 23!
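One thing that may be worth checking before reading too much into the gap: as far as I understand it, plot_ly() stores its arguments and defers most of the work until the plot is built for printing, whereas ggplotly() converts the ggplot object straight away. Below is a hedged sketch of a comparison that forces the build step for both via plotly_build(). It uses a made-up data frame (toy is hypothetical) and fewer repetitions. Timings will vary by machine and package version, so none are quoted here.

```r
library("ggplot2")
library("plotly")
library("microbenchmark")

# Hypothetical stand-in for meanb: one row per year
toy = data.frame(year = 1969:1988, mean = seq(170, 189))

# Wrap both calls in plotly_build() so each expression does the full
# conversion to the list of traces that plotly.js receives
time_built = microbenchmark(
  ggplotly = plotly_build(ggplotly(ggplot(toy) +
    geom_point(aes(y = mean, x = year, colour = year)))),
  plotly = plotly_build(plot_ly(data = toy,
    y = ~mean, x = ~year, color = ~year,
    type = "scatter", mode = "markers")),
  times = 10, unit = "s")
summary(time_built)[, c("expr", "median")]
```

If the gap narrows here, part of the difference measured earlier is down to when the work happens rather than how much of it there is.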

Let’s take it up a notch. There we were plotting only 20 points; what if we plot over 20,000? Here we will plot the min, mean and max births on each day.

date = Birthdays %>% 
  group_by(date) %>% 
  summarise(mean = mean(births), min = min(births), max = max(births)) %>% 
  gather(birth_stat, value, -date)
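As a quick sanity check on "over 20,000": assuming every calendar day from 1969 to 1988 appears in the data (an assumption, not verified here), the long data frame has three rows per day. In base R:

```r
# Assumption: one group per calendar day, 1969-01-01 to 1988-12-31 inclusive
days = seq(as.Date("1969-01-01"), as.Date("1988-12-31"), by = "day")
n_days = length(days)

# gather() stacks the mean, min and max columns, so three rows per day
n_points = 3 * n_days
n_points
```

That comes to 7,305 days and 21,915 points.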

Wrapping this as a ggplot2 object inside ggplotly() we obtain this graph…

ggplotly(ggplot(date) +
    geom_point(aes(y = value, x = date, colour = birth_stat)))
