R Markdown Tips and Tricks #3: Time-savers & Trouble-shooters

[This article was first published on RStudio | Open source & professional software for data science teams on RStudio, and kindly contributed to R-bloggers].

Photo by Jeremy Bezanger on Unsplash

The R Markdown file format combines R programming and the markdown language to create dynamic, reproducible documents. Authors use R Markdown for reports, slide shows, blogs, books — even Shiny apps! Since users can do so much with R Markdown, it’s important to be efficient with time and resources.

We asked our Twitter friends for the tips and tricks that they have picked up along their R Markdown journey. There was a flurry of insightful responses ranging from organizing files to working with YAML, and we wanted to highlight some of the responses so that you can apply them to your work, as well.

This is the third of a four-part series to help you on your path to R Markdown success, where we discuss features and functions that save you time and help you troubleshoot. You can find many of these tips and tricks in the R Markdown Cookbook. We’ve included the link to the relevant chapter (and other resources) in each section.

1. Convert an R script into an R Markdown document with knitr::spin()

Have you ever wished you could transform an R script into an R Markdown document without having to copy and paste your code? The function knitr::spin() lets you do just that. Pass your R script to spin() and watch the transformation happen.

File icons representing the conversion of an R script into an R Markdown document

You can quickly move from coding your analysis to writing your reports. In addition, you can keep your workflow reproducible by including knitr::spin() at the end of your R script. Rerun it any time you update your analysis so that your source code and R Markdown report are synced.

  • R Markdown Cookbook chapter: Render an R script to a report.
  • You can also transform an R script into a report using #' comments. The Render an R script chapter of Happy Git and GitHub for the useR walks through how to create a render-ready R script.
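
As a minimal sketch of the conversion, the snippet below writes a small script to a temporary folder and converts it without knitting. The file name analysis.R and its contents are made up for illustration; knit = FALSE is used so that only the .Rmd file is produced.

```r
# Write a small R script to a temporary folder (hypothetical file name).
script <- file.path(tempdir(), "analysis.R")
writeLines(c(
  "#' # My analysis",
  "#'",
  "#' Lines starting with #' become regular text in the report.",
  "",
  "x <- c(1, 2, 3)",
  "mean(x)"
), script)

# knit = FALSE stops after the conversion, so only the .Rmd is written.
knitr::spin(script, knit = FALSE)

# spin() writes the .Rmd next to the script, with the same base name.
rmd <- sub("\\.R$", ".Rmd", script)
rmd_lines <- readLines(rmd)
cat(rmd_lines, sep = "\n")
```

The #' lines should come out as plain text and the remaining code should be wrapped in a code chunk. Dropping knit = FALSE would also knit the result into a report.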

2. Convert an R Markdown document into an R script with knitr::purl()

Now let’s flip it around! What if you want to extract only the R code from your R Markdown report? For this, use the function knitr::purl().

File icons representing the conversion of an R Markdown document into an R script

The output from purl() can show no text, all text, or just the chunk options from your .Rmd file depending on the documentation argument.

# Extracts only pure R code
knitr::purl("script.Rmd", documentation = 0L)

# Extracts R code and chunk options
knitr::purl("script.Rmd", documentation = 1L)

# Extracts all text
knitr::purl("script.Rmd", documentation = 2L)

If you do not want certain code chunks to be extracted, you can set the chunk option purl = FALSE.

```{r ignored}
#| purl = FALSE

x = rnorm(1000)
```
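
To try this end to end, here is a small sketch that builds a throwaway .Rmd file in a temporary folder and extracts its code. The file names, chunk labels, and chunk contents are assumptions made for the example, and the purl option is written in the chunk header rather than as a #| comment.

```r
# Build a throwaway R Markdown file (hypothetical content).
rmd <- file.path(tempdir(), "report.Rmd")
writeLines(c(
  "Some prose that should not end up in the script.",
  "",
  "```{r kept}",
  "x <- 1 + 1",
  "```",
  "",
  "```{r skipped, purl=FALSE}",
  "y <- 2 + 2",
  "```"
), rmd)

# documentation = 0L keeps only the code, without chunk headers or text.
r_file <- knitr::purl(
  rmd,
  output = file.path(tempdir(), "report.R"),
  documentation = 0L,
  quiet = TRUE
)

code_lines <- readLines(r_file)
cat(code_lines, sep = "\n")
```

The extracted script should contain the code from the kept chunk only; the prose and the chunk marked purl = FALSE are left out.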

3. Reuse code chunks throughout your document

The knitr package provides several options to avoid copying and pasting your code. One way is to use reference labels with the chunk option ref.label. Let’s use the example from the R Markdown Cookbook. Say you have these two chunks:

```{r chunk-b}
# this is the chunk b
1 + 1
```

```{r chunk-c}
# this is the chunk c
2 + 2
```

You can write a chunk that combines chunk-c and chunk-b:

```{r chunk-a}
#| ref.label = c("chunk-c", "chunk-b")
```

Your chunk-a will render like this:

# this is the chunk c
2 + 2

## [1] 4

# this is the chunk b
1 + 1

## [1] 2

Please note that any code inside of chunk-a will not be evaluated.
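
If you would like to check this behaviour without creating a file, one option is to knit the same three chunks from a character vector. This is only a sketch: it passes the document through the text argument of knitr::knit() and writes the ref.label option in the chunk header instead of a #| comment.

```r
# The three chunks from above, written as lines of an R Markdown document.
rmd_text <- c(
  "```{r chunk-a, ref.label=c('chunk-c', 'chunk-b')}",
  "```",
  "",
  "```{r chunk-b}",
  "# this is the chunk b",
  "1 + 1",
  "```",
  "",
  "```{r chunk-c}",
  "# this is the chunk c",
  "2 + 2",
  "```"
)

# Knit the text in memory; the result is Markdown, not a rendered report.
md <- paste(knitr::knit(text = rmd_text, quiet = TRUE), collapse = "\n")
cat(md)
```

In the printed Markdown, chunk-a comes first and runs the code of chunk-c followed by the code of chunk-b, before the two original chunks appear in their own order.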

One application of ref.label puts all of your code in an appendix. The code output will show up in the document’s main body, and the code chunks will appear only at the end.

# Appendix

```{r}
#| ref.label = knitr::all_labels(),
#| echo = TRUE,
#| eval = FALSE
```

4. Cache your chunks (with dependencies)

If there is a chunk in your R Markdown file that takes a while to run, you can set cache = TRUE to pre-save the results for the future. The next time you knit the document, the code will call the cached object rather than rerun (provided that nothing in the cached chunk has changed). This can save a lot of time when loading big files or running intensive processes.

```{r load-data}
#| cache = TRUE

dat <- read.csv("HUGE_FILE_THAT_TAKES_FOREVER_TO_LOAD.csv")
```

If one of your later chunks depends on the output from a cached chunk, include the dependson option. The chunk will rerun if something has changed in the cached chunk.

For example, say you define a value in one chunk and use it in another chunk:

```{r cached-chunk}
#| cache = TRUE

x <- 500
x
```

```{r dependent-chunk}
#| cache = TRUE,
#| dependson = "cached-chunk"

x + 5
```

This will result in 505.

Now, you edit the cached chunk:

```{r cached-chunk}
#| cache = TRUE

x <- 600
x
```

With the dependson option, your dependent chunk will update when you edit your cached chunk. In this case, your dependent chunk will now output 605.

Without the dependson option, your dependent chunk will use the previously cached result and output 505, even though the cached chunk now says x <- 600.

Too much caching going on? You can reset all your caches by using a global chunk option in the first code chunk of your document, e.g., knitr::opts_chunk$set(cache.extra = 1). This chunk option name can be arbitrary but we recommend that you do not use an existing option name in knitr::opts_chunk$get() (e.g., cache.extra is not a built-in option). If you want to reset the caches again, you can set the option to a different value.
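
A short sketch of that reset, assuming the lines live in the first chunk of your document. The values 1 and 2 are arbitrary; any change to the value should do.

```r
# In the first chunk of the document: turn caching on for every chunk
# and attach an extra option whose only job is to be changed later.
knitr::opts_chunk$set(cache = TRUE, cache.extra = 1)

# To reset the caches, edit the value and knit again.
knitr::opts_chunk$set(cache.extra = 2)

# Check the value that chunks will now pick up.
knitr::opts_chunk$get("cache.extra")
```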

5. Save the content of a chunk elsewhere with the cat engine

You may want to write the content of a code chunk to an external file to use later in the document. The cat engine makes this possible. The files do not have to be .R files either: they can be .txt, .sql, etc.

```{cat}
#| engine.opts = list(file = "example.txt")

This text will be saved as "example.txt" in your project directory.
```

One application is to use the cat engine to save a .sql file, then execute that file in a code chunk later in the document.

Create this SQL script in a chunk and save it in a file:

```{cat}
#| engine.opts = list(file = "tbl.sql", lang = "sql")

SELECT episode, title
FROM "database/elements"."elements"
LIMIT 3
```

Read it in your .Rmd file:

```{sql}
#| connection = con,
#| code = readLines("tbl.sql")
```

The cat engine is helpful when you want to create a reprex for your R Markdown documents. You can keep everything within your document rather than having to attach external files so others can reproduce your work.

6. Include parameters to easily change values

Include parameters to set values throughout your report. When you need to rerun the report with new values, you will not have to manually change them throughout your document. For example, if you want to display data for a particular class of cars, set the parameter my_class:

---
title: "Daily Report"
output: "html_document"
params:
  my_class: "fuel economy"
---

In the document, reference the parameter with params$:

```{r setup}
#| include = FALSE

library(dplyr)
library(ggplot2)

mtcars_df <-
  mtcars %>%
  mutate(class = case_when(mpg > 15 ~ "fuel economy",
                           TRUE ~ "not fuel economy"))

class <- 
  mtcars_df %>% 
  filter(class == params$my_class)
```

# Wt vs mpg for `r params$my_class` cars

```{r}
ggplot(class, aes(wt, mpg)) + 
  geom_point() + 
  geom_smooth(se = FALSE)
```

To create a report that uses the new set of parameter values, you can use the rmarkdown::render() function. Add the params argument to render() with your updated value:

rmarkdown::render("paramDoc.Rmd", params = list(my_class = "not fuel economy"))

The report will now output the values for ‘not fuel economy’ cars.
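
If you need one report per value, the render() call can be wrapped in a small function and looped over. The sketch below is one possible way to do it; the helper names report_file_name() and render_class_report() and the output file naming scheme are hypothetical, and the loop is commented out because it needs paramDoc.Rmd to exist in your working directory.

```r
# The parameter values to loop over.
classes <- c("fuel economy", "not fuel economy")

# Hypothetical helper: build a distinct output file name for each value.
report_file_name <- function(my_class) {
  paste0("report-", gsub(" ", "-", my_class, fixed = TRUE), ".html")
}

# Hypothetical helper: render the parameterized report for one value.
render_class_report <- function(my_class, input = "paramDoc.Rmd") {
  rmarkdown::render(
    input,
    output_file = report_file_name(my_class),
    params = list(my_class = my_class)
  )
}

# Uncomment once paramDoc.Rmd is available:
# invisible(lapply(classes, render_class_report))
```

Giving each report its own output_file keeps one run from overwriting the previous one.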

7. Create templates with knit_expand()

With knitr::knit_expand(), you can replace expressions in {{}} with their values.

For example,

```{r}
knitr::knit_expand(text = 'The value of pi is {{pi}}.')
```

```
[1] "The value of pi is 3.14159265358979."
```

You can also create templates for your R Markdown files with knit_expand(). Create a file with the tabs or headings that you would like. Create a second file that loops through the data you would like to output.

Take this example from the R Markdown Cookbook. Create a template.Rmd file:

# Regression on {{i}}

```{r lm-{{i}}}
lm(mpg ~ {{i}}, data = mtcars)
```
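
To see what a single expansion of this template looks like, here is a sketch that passes the same lines through the text argument instead of reading template.Rmd, and supplies i as a named argument. The choice of "wt" is only an example.

```r
# The template from above, as lines of text rather than a file.
template <- c(
  "# Regression on {{i}}",
  "",
  "```{r lm-{{i}}}",
  "lm(mpg ~ {{i}}, data = mtcars)",
  "```"
)

# Replace every {{i}} with the value passed as a named argument.
expanded <- knitr::knit_expand(text = template, i = "wt")
cat(expanded)
```

Each {{i}} is replaced by wt, which gives a heading, a uniquely labeled chunk, and a model formula for that one variable.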

Then create another file that loops through each of the variables in mtcars except mpg:

```{r}
#| echo = FALSE,
#| results = "asis"

src = lapply(setdiff(names(mtcars), 'mpg'), function(i) {
  knitr::knit_expand('template.Rmd')
})

res = knitr::knit_child(text = unlist(src), quiet = TRUE)
cat(res, sep = '\n')
```

Knit this file to create a report that applies the template to each of the non-mpg variables.

 

In our original Twitter thread, Felipe Mattioni Maturana shows us an example from his work.

8. Exit knitting early with knit_exit()

Exit the knitting process before the end of the document with knit_exit(). The knitr package will write out the results up to that point and ignore the remainder of the document.

You can use knit_exit() either inline or in a code chunk.

```{r chunk-one}
x <- 100
```

`r knitr::knit_exit()`

```{r chunk-two}
y
```

The rendered document will only show chunk-one. This is helpful if you run into errors and want to find where they are by splitting up your document.
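
The chunk form can be tried out in memory as well. In this sketch the document is passed to knitr::knit() as text, knit_exit() is called at the end of the first chunk, and the second chunk contains a made-up print() call that should never run.

```r
# A two-chunk document; knitting stops at the end of chunk-one.
rmd_text <- c(
  "```{r chunk-one}",
  "x <- 100",
  "knitr::knit_exit()",
  "```",
  "",
  "```{r chunk-two}",
  "print('never reached')",
  "```"
)

# Knit the text in memory and look at the resulting Markdown.
md <- paste(knitr::knit(text = rmd_text, quiet = TRUE), collapse = "\n")
cat(md)
```

Only the code from chunk-one should show up in the output. Moving the knit_exit() call further down the document, one chunk at a time, is one way to narrow down where an error comes from.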

Continue the Journey

We hope that these tips & tricks help you save time and troubleshoot in R Markdown. Thank you to everybody who shared advice, workflows, and features!

Stay tuned for the last post in this four-part series: Looks better, works better.

Resources

  • Peruse the R Markdown Cookbook for more tips and tricks.
  • RStudio Connect is an enterprise-level product from RStudio to publish and schedule reports, enable self-service customization, and distribute beautiful emails.