# Comparing plotly & ggplotly plot generation times

[This article was first published on **R on The Jumping Rivers Blog**, and kindly contributed to R-bloggers.]


The **plotly** package. A godsend for interactive documents, dashboards and presentations. For such documents, most people would prefer a plot created in **plotly** to one created in **ggplot2**. Why? **plotly** gives you neat and, crucially, *interactive* options at the top of the plot, whereas **ggplot2** objects are static. In an app we have been developing here at Jumping Rivers, we found ourselves asking: would it be quicker to use `plot_ly()` or to wrap a **ggplot2** object in `ggplotly()`? I found the results staggering.

### Prerequisites

Throughout we will be using the packages **dplyr**, **tidyr**, **ggplot2**, **plotly** and **microbenchmark**. The data in use is the `Birthdays` dataset in the **mosaicData** package. This data set contains the daily birth count in each state of the USA from 1969 to 1988. The packages can be installed in the usual way (remember you can install packages in parallel):

    install.packages(c("mosaicData", "dplyr", "tidyr", "ggplot2", "plotly", "microbenchmark"))

    library("mosaicData")
    library("dplyr")
    library("tidyr")
    library("ggplot2")
    library("plotly")
    library("microbenchmark")
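The parallel install mentioned above can be sketched with the `Ncpus` argument of `install.packages()`, which sets how many processes are used when several packages are built from source. The helper name `install_parallel()` and the default of two processes are illustrative assumptions, not part of the original post.

```r
# Packages used in this post
pkgs <- c("mosaicData", "dplyr", "tidyr", "ggplot2", "plotly", "microbenchmark")

# Hypothetical helper (not from the original post): install `pkgs`
# using up to `ncpus` parallel processes for source builds.
install_parallel <- function(pkgs, ncpus = 2L) {
  install.packages(pkgs, Ncpus = ncpus)
}

# Not run here, since it needs a network connection:
# install_parallel(pkgs, ncpus = parallel::detectCores())
```

Setting `options(Ncpus = 2)` once per session has the same effect, because `install.packages()` reads its default from that option.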

### Analysis

Let’s load and take a look at the data.

    data("Birthdays", package = "mosaicData")
    head(Birthdays)
    ##   state year month day       date wday births
    ## 1    AK 1969     1   1 1969-01-01  Wed     14
    ## 2    AL 1969     1   1 1969-01-01  Wed    174
    ## 3    AR 1969     1   1 1969-01-01  Wed     78
    ## 4    AZ 1969     1   1 1969-01-01  Wed     84
    ## 5    CA 1969     1   1 1969-01-01  Wed    824
    ## 6    CO 1969     1   1 1969-01-01  Wed    100

First, we’ll create a very simple scatter graph of the mean births in every year.

    meanb = Birthdays %>%
      group_by(year) %>%
      summarise(mean = mean(births))

Wrapping this as a **ggplot** object inside `ggplotly()`, we obtain this…

    ggplotly(ggplot(meanb) + geom_point(aes(y = mean, x = year, colour = year)))

Whilst using `plot_ly()` gives us this…

    plot_ly(data = meanb, y = ~mean, x = ~year, color = ~year, type = "scatter")

Both graphs are identical, bar styling, yes?

Now let’s use **microbenchmark** to see how their timings compare (for an overview on timing R functions, see our previous blog post).

    time = microbenchmark::microbenchmark(
      ggplotly = ggplotly(ggplot(meanb) + geom_point(aes(y = mean, x = year, colour = year))),
      plotly = plot_ly(data = meanb, y = ~mean, x = ~year, color = ~year, type = "scatter"),
      times = 100, unit = "s")
    time
    ## Unit: seconds
    ##      expr      min       lq     mean   median       uq     max neval cld
    ##  ggplotly 0.050139 0.052229 0.070750 0.054760 0.056785 1.56652   100   b
    ##    plotly 0.002475 0.002527 0.003017 0.002571 0.002674 0.03061   100  a

    autoplot(time)

Now, I thought nesting a **ggplot** object within `ggplotly()` would be slower than using `plot_ly()`, but I didn’t think it would be this slow. On average, `ggplotly()` is approximately 23 times slower than `plot_ly()`. *23!*
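The "23 times" figure can be checked directly from the timings printed above. The sketch below copies the mean and median values from that output into a small data frame (the name `timings` is only for illustration) and computes the slowdown. On a live run, the same columns should be available from `summary(time)`.

```r
# Mean and median timings (seconds), copied from the benchmark output above
timings <- data.frame(
  expr   = c("ggplotly", "plotly"),
  mean   = c(0.070750, 0.003017),
  median = c(0.054760, 0.002571)
)

# Slowdown of ggplotly() relative to plot_ly()
ratio_mean   <- timings$mean[1] / timings$mean[2]
ratio_median <- timings$median[1] / timings$median[2]

round(c(mean = ratio_mean, median = ratio_median), 1)
```

The factor of roughly 23 comes from the mean timings. The median, which is less sensitive to the single 1.57 second outlier in the `ggplotly()` run, gives a factor of roughly 21.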

Let’s take it up a notch. There we were plotting only 20 points; what about if we plot over 20,000? Here we will plot the min, mean and max births on each day.

    date = Birthdays %>%
      group_by(date) %>%
      summarise(mean = mean(births), min = min(births), max = max(births)) %>%
      gather(birth_stat, value, -date)
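As a rough check on the "over 20,000" figure, the number of points can be estimated from the date range alone. This sketch assumes the data contain exactly one row per calendar day from 1 January 1969 to 31 December 1988, which the original post does not state explicitly; on the real data, `nrow(date)` gives the actual count.

```r
# Assumption: one row per calendar day from 1969-01-01 to 1988-12-31
days <- seq(as.Date("1969-01-01"), as.Date("1988-12-31"), by = "day")

n_days   <- length(days)  # number of distinct dates
n_stats  <- 3             # mean, min and max
n_points <- n_days * n_stats

n_points
```

Under that assumption there are 7,305 days and therefore 21,915 plotted points, compared with 20 in the first example.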

Wrapping this as a **ggplot2** object inside `ggplotly()`, we obtain this graph…

    ggplotly(ggplot(date) + geom_point(aes(y = value, x = date, colour = birth_stat)))
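For comparison, the `plot_ly()` counterpart of this plot would presumably mirror the earlier call, mapping `birth_stat` to colour. The sketch below is an assumption rather than code from the original post. It uses a tiny made-up data frame (`toy`, with invented values) in place of `date` so that it runs on its own, and it sets `mode = "markers"` explicitly to request points.

```r
library("plotly")

# Hypothetical stand-in for `date`: same columns, invented values
toy <- data.frame(
  date       = rep(as.Date(c("1969-01-01", "1969-01-02")), times = 3),
  birth_stat = rep(c("mean", "min", "max"), each = 2),
  value      = c(180, 175, 14, 12, 824, 810)
)

# One marker per row, coloured by statistic
p <- plot_ly(data = toy, y = ~value, x = ~date, color = ~birth_stat,
             type = "scatter", mode = "markers")

# Printing `p` at the console renders the interactive widget
```

With the real data, replacing `toy` by `date` gives the plot to benchmark against the `ggplotly()` version.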
