Containerizing Interactive R Markdown Documents

Posted on July 8, 2022 by Peter Solymos in R bloggers | 0 Comments

[This article was first published on R - Hosting Data Apps, and kindly contributed to R-bloggers.]

The rmarkdown package is behind the versatility of R Markdown, with dozens of standard and community-provided output formats ranging from HTML, Word, and PDF to slides, books, and interactive documents. This abundance of awesomeness is a direct continuation of a long line of predecessors: Sweave/LaTeX, knitr, and pandoc. Its success is the foundation on which Quarto is built.

The htmlwidgets R package provides the basis for interactive JavaScript widgets that you can embed in HTML outputs. These are pre-rendered objects that respond to various gestures, like hover and click events. You just render the document once, and you are done until the next time when the document needs updating.

True reactivity, however, requires a lot more JavaScript heavy lifting (e.g. using Observable), or you can use Shiny as the runtime for the R Markdown document. Such documents require a web server to watch for reactive updates in the background. This makes them effectively Shiny apps.

As with any type of Shiny app, a lot of the hosting options out there require the Shiny app to run inside of a Docker container (e.g. Heroku, ShinyProxy, Fly). Because interactive R Markdown documents differ from Shiny apps in subtle ways, serving them is also slightly different. In this post, we review how to “dockerize” R Markdown documents with different runtime environments.

Prerequisites

We will use the scripts from the analythium/rmarkdown-docker-examples GitHub repository.

You can also pull the following two Docker images:

docker pull eddelbuettel/r2u:20.04
docker pull nginx:alpine

Runtime: Shiny

The way to make an R Markdown document interactive/reactive is to add runtime: shiny to the document’s YAML header. You can then add Shiny widgets and Shiny render functions to the file’s R code chunks, so that the rendered HTML document includes reactive components.

Here is the runtime-shiny/index.Rmd file as our first document (following this example):

---
title: "Runtime: shiny"
output: html_document
runtime: shiny
---

Here are two Shiny widgets

```{r echo = FALSE}
selectInput("n_breaks",
  label = "Number of bins:",
  choices = c(10, 20, 35, 50),
  selected = 20)
sliderInput("bw_adjust",
  label = "Bandwidth adjustment:",
  min = 0.2,
  max = 2,
  value = 1,
  step = 0.2)
```

And here is a histogram

```{r echo = FALSE}
renderPlot({
  hist(faithful$eruptions,
    probability = TRUE,
    breaks = as.numeric(input$n_breaks),
    xlab = "Duration (minutes)",
    main = "Geyser eruption duration")
  dens <- density(faithful$eruptions,
    adjust = input$bw_adjust)
  lines(dens,
    col = "blue")
})
```

Use rmarkdown::run() instead of rmarkdown::render("index.Rmd") to start the document as a Shiny app.
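To try the document outside Docker first, a small shell helper along these lines can assemble the R call. This is only a sketch: run_expr is a hypothetical helper name, and launching the app assumes that R with the shiny and rmarkdown packages is installed locally.

```shell
# Hypothetical helper: print the R expression that serves an Rmd file
# as a Shiny app on the given port (default 3838).
run_expr() {
  file="$1"
  port="${2:-3838}"
  printf "rmarkdown::run('%s', shiny_args = list(port = %s))\n" "$file" "$port"
}

# To launch the app (assumes Rscript is on the PATH):
#   Rscript -e "$(run_expr index.Rmd 3838)"
```

The helper only prints the expression, so you can inspect it before handing it to Rscript.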

We will use the following Dockerfile:

FROM eddelbuettel/r2u:20.04

RUN apt-get update && apt-get install -y --no-install-recommends \
    pandoc \
    && rm -rf /var/lib/apt/lists/*

RUN install.r shiny rmarkdown

RUN addgroup --system app && adduser --system --ingroup app app
WORKDIR /home/app
COPY runtime-shiny .
RUN chown app:app -R /home/app
USER app

EXPOSE 3838

CMD ["R", "-e", "rmarkdown::run(shiny_args = list(port = 3838, host = '0.0.0.0'))"]

Here is the explanation for each line:

the eddelbuettel/r2u parent image is one of the most significant improvements in developer experience in the past few years: it cuts Docker build times to seconds thanks to full dependency resolution through Ubuntu's apt package manager
install pandoc with apt, which rmarkdown needs to render the document
install R packages
add a user named app and create a /home/app folder for this user
copy the contents of the runtime-shiny folder into the /home/app folder
set file permissions and make app the user of the container
expose port 3838
define the command using rmarkdown::run(), making sure Shiny listens on the port we expect

You can build and run the image:

docker build -f Dockerfile.shiny -t psolymos/rmd:shiny .

docker run -p 8080:3838 psolymos/rmd:shiny

Visit localhost:8080 to see the R Markdown document running as a Shiny app.

However, because the Shiny runtime requires a full document render for each end-user browser session, it can perform poorly for documents that don’t render quickly.
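Because the first response can take a while, it may help to wait until the app answers before opening the browser. The following is a minimal sketch of such a check. wait_for_app is a hypothetical helper, and it assumes curl is installed on the host and that the container was started with the port mapping used earlier.

```shell
# Hypothetical helper: poll a URL until it responds successfully.
# $1 = URL, $2 = max attempts (default 30), $3 = delay in seconds (default 1)
wait_for_app() {
  url="$1"
  tries="${2:-30}"
  delay="${3:-1}"
  i=1
  while [ "$i" -le "$tries" ]; do
    # -f: treat HTTP errors as failures, -s: silent, discard the body
    if curl -fs -o /dev/null --max-time 5 "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $tries attempts"
  return 1
}

# Example: wait_for_app http://localhost:8080 30 1
```

The function returns a non-zero status on timeout, so it can gate later steps in a script.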

Runtime: Shinyrmd

Pre-rendered Shiny documents address the slow-rendering problem of the Shiny runtime. This is where runtime: shinyrmd (or its alias, runtime: shiny_prerendered) comes in. Such documents are pre-rendered before deployment so that the HTML loads faster, with no need to wait for Shiny to render it for each session.

The Shinyrmd runtime also comes with various contexts: server-start/setup/data (analogous to global.R), render (like the UI), and server. These contexts provide a hybrid model of execution, where some code is run once when the document is pre-rendered and some code is run every time the user interacts with the document.

The runtime-shinyrmd folder contains another Rmd file (based on this flexdashboard example):

---
title: "Runtime: shinyrmd"
output: flexdashboard::flex_dashboard
runtime: shinyrmd
---

```{r setup, include=FALSE}
library(dplyr)
knitr::opts_chunk$set(echo = FALSE)
```

```{r data, include=FALSE}
faithful_data <- sample_n(faithful, 100)
```

Column {.sidebar}
--------------------------------------------

```{r}
selectInput("n_breaks",
  label = "Number of bins:",
  choices = c(10, 20, 35, 50),
  selected = 20)
sliderInput("bw_adjust",
  label = "Bandwidth adjustment:",
  min = 0.2,
  max = 2,
  value = 1,
  step = 0.2)
```

Based on [this](...) example.

Column
--------------------------------------------

### Geyser Eruption Duration

```{r}
plotOutput("eruptions")
```

```{r, context="server"}
output$eruptions <- renderPlot({
  hist(faithful_data$eruptions,
    probability = TRUE,
    breaks = as.numeric(input$n_breaks),
    xlab = "Duration (minutes)",
    main = "Geyser Eruption Duration")
  dens <- density(faithful_data$eruptions,
    adjust = input$bw_adjust)
  lines(dens,
    col = "blue")
})
```

You can render and run the document with rmarkdown::run().

The Dockerfile is slightly modified from the Shiny runtime:

we need 2 more dependencies
we need to pre-render the document with rmarkdown::render() so that it is there when we spin up the container

FROM eddelbuettel/r2u:20.04

RUN apt-get update && apt-get install -y --no-install-recommends \
    pandoc \
    && rm -rf /var/lib/apt/lists/*

RUN install.r shiny rmarkdown flexdashboard dplyr

RUN addgroup --system app && adduser --system --ingroup app app
WORKDIR /home/app
COPY runtime-shinyrmd .
RUN R -e "rmarkdown::render('index.Rmd')"
RUN chown app:app -R /home/app
USER app

EXPOSE 3838

CMD ["R", "-e", "rmarkdown::run(shiny_args = list(port = 3838, host = '0.0.0.0'))"]

Build and run:

docker build -f Dockerfile.shinyrmd -t psolymos/rmd:shinyrmd .

docker run -p 8080:3838 psolymos/rmd:shinyrmd

Visit localhost:8080 to see the R Markdown document running as a pre-rendered Shiny app.

The docker build is very fast, thanks to the r2u image we used. The image size is around 1 GB, a bit larger than the ~800 MB parent image.

Runtime: Static

The static runtime, as its name implies, creates a static document. It stays the same until some of the document's inputs (images, data) change and the document is re-rendered. This gives us an easy way to render the HTML document locally, copy it into a Docker image, and serve it with Nginx using this Dockerfile:

FROM nginx:alpine
COPY runtime-static/index.html /usr/share/nginx/html/index.html
CMD ["nginx", "-g", "daemon off;"]

This creates a tiny image (30 MB). To see the result, run the container and forward port 80, where Nginx serves the static files.

What if you want to take advantage of a Docker-based build environment? You might experience issues with some of the dependencies on certain operating systems, or your IT department might not allow you to install packages yourself but you can use Docker ... Or what if you just want to complicate something that should be simple?

This brings us to a neat Docker build feature called multi-stage builds. We know that our Ubuntu-based image is quite big, so we only want to use that to render the HTML. Once it is done, we just insert that artifact into the small Nginx image.

Multi-stage build

With multi-stage builds, you use multiple FROM statements in your Dockerfile. You can selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image.

Let's see how this works for our R Markdown example. Here is the stripped-down static index.Rmd file from the runtime-static folder:

---
title: "Runtime: static"
output: flexdashboard::flex_dashboard
runtime: static
---

```{r setup, include=FALSE}
library(dplyr)
knitr::opts_chunk$set(echo = FALSE)
```

```{r data, include=FALSE}
faithful_data <- sample_n(faithful, 100)
```

Column {.sidebar}
--------------------------------------

Based on [this](...) example.

Column
-------------------------------------

### Geyser Eruption Duration

```{r}
hist(faithful_data$eruptions,
  probability = TRUE,
  breaks = 20,
  xlab = "Duration (minutes)",
  main = "Geyser Eruption Duration")
dens <- density(faithful_data$eruptions,
  adjust = 1)
lines(dens,
  col = "blue")
```

Here is the 2-stage Dockerfile:

FROM eddelbuettel/r2u:20.04 AS builder
RUN apt-get update && apt-get install -y --no-install-recommends \
    pandoc \
    && rm -rf /var/lib/apt/lists/*
RUN install.r shiny rmarkdown flexdashboard dplyr
WORKDIR /root
COPY runtime-static .
RUN R -e "rmarkdown::render('index.Rmd', output_dir = 'output')"

FROM nginx:alpine
COPY --from=builder /root/output /usr/share/nginx/html
CMD ["nginx", "-g", "daemon off;"]

The 1st stage looks familiar, except we don't worry about being the root user for the build step. We name this stage builder using AS {name} after the FROM instruction.

The 2nd stage uses another FROM instruction, and we specify that we COPY from the builder stage: --from=builder. We grab all the rendered HTML and move it to the Nginx HTML folder to be served by the file server.

We just took advantage of the R build environment to render the document, and we ended up with a minimal-sized image with the static content inside.
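If the render step fails, it can be handy to build only the first stage and inspect it. Docker's --target option stops the build at a named stage. The sketch below wraps that in two hypothetical helpers (run and build_stage); the psolymos/rmd:builder tag in the example is likewise just an illustrative name, not one used in the repository.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Build a single named stage of a multi-stage Dockerfile.
# $1 = Dockerfile, $2 = stage name, $3 = image tag (all caller-supplied)
build_stage() {
  run docker build -f "$1" --target "$2" -t "$3" .
}

# Example (requires Docker):
#   build_stage Dockerfile.static builder psolymos/rmd:builder
```

Setting DRY_RUN=1 prints the docker command without executing it, which is useful for checking the arguments first.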

Build and run:

docker build -f Dockerfile.static -t psolymos/rmd:static .

docker run -p 8080:80 psolymos/rmd:static

Conclusions

The Shiny and the pre-rendered Shinyrmd runtimes for R Markdown make it possible to write documents that users can interact with. This is a great way to get started with reactive programming for folks who are already familiar with R Markdown.

We can treat such interactive documents similarly to Shiny apps and deploy them using Docker containers. When it comes to static R Markdown documents, nothing prevents us from serving these from containers as well. We learned how to shrink the Docker image using a multi-stage build.
