11 tricks to level up your rmarkdown documents

Posted on April 15, 2023 by Code R in R bloggers | 0 Comments

[This article was first published on Code R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

For a while I wanted to write a post to compile some of the tricks I’ve learnt over the years of using rmarkdown. I also wanted other people’s input so I asked for suggestions on Mastodon. So here are the 12 tips I decided to include in no particular order.

Make chunk options non-optional

I use this trick to force myself to write captions to all figures:

knitr::opts_hooks$set(label = function(options) {
   if (is.null(options$fig.cap)) {
    stop("Every figure has to have a caption!")  
   }
   return(options)
})

This “hook” runs once for each chunk in which label is not NULL –which is all of them– and throws an error if the fig.cap option is NULL (missing). This will absolutely force your lazy ass to actually put captions in all your figures.

Inspired by Zhian N Kamvar I’m now also using this to force myself to always name all my chunks:

knitr::opts_hooks$set(label = function(options) {
  # Check if the label comes from the default label for unnamed chuncks
  default_label <- knitr::opts_knit$get("unnamed.chunk.label")
  has_default_label <- grepl(default_label, options$label)
  
  if (has_default_label) {
    stop("Name your chunks!")
  }
  return(options)
})

This code can’t just check for is.NULL(options$label) because unnamed chunks get default labels, so it gets the default label with knitr::opts_knit$get("unnamed.chunk.label") and then checks if the name of the chunk is auto-generated. This fails in the ridiculously edge case of manually-defined labels that contain the same text as the default label.

You can use these principles to run all kinds of checks on your chunk options. The only issue with this approach is that the knitting process ends when it finds the first “bad” instance. It might be better to record all the offending chunks and their reasons and then throw a message or error at the end.

Captions using text references

Good captions tend to be long and they sometimes include complex strings like $\LaTeX$ notation or references to previous figures (e.g. “Same as Fig. 2 but for clowns with red noses.”). These kinds of captions can be hard to parse visually as chunk options and need a lot of escaped characters if written as a string. Bookdown offers text references as a solution.

You can start a line of text with (ref:label) and then use (ref:label) to refer to it in your chunk options. So you can write

(ref:red-noses-cap) Same as Figure \@ref(fig:blue-noses) but for clowns with *red* noses. And this is a mathematical formula just as an example: $\pi=3$.

And then write the chunk as

```{r, fig.cap = "(ref:red-noses-cap)"}
plot(red_noses)
```

Default caption

When using text references to define captions, I’ve found that it’s a bit annoying and redundant to explicitly set the fig.cap option. Instead I prefer to set up a default caption of the form (ref:label-cap) like this:

knitr::opts_hooks$set(label = function(options) {
  if (is.null(options$fig.cap)) {
    options$fig.cap <- paste0("(ref:", options$label, "-cap)")
  }
})

With this, every chunk will have a default text-referenced caption with a predictable name.

Save plots in multiple formats

Did you know that the dev chunk option can be a vector of formats? This enables you to save figures in multiple formats at once.

knitr::opts_chunk$set(dev = c('png', 'svg'))

This simple but possibly overlook feature (suggested by Robert Flight) can be useful if you want to use vector graphics in your document but also need raster versions to share more easily with your colleges or online.

Exit prematurely

With long and complex documents sometimes come weird errors that are hard to pin down. Or you might want to work on some early part of the document even if some later parts are unfinished and don’t knit. Both Mickaël CANOUIL and superboreen pointed out that you can use knitr::knit_exit() to end document rendering “before it gets to the hideous code I haven’t fixed yet”.

knitr::knit_exit()

Get the output format

While the promise of rmarkdown is to have portable code that can be rendered into any document, but abstractions are leaky and this doesn’t always work out. For example, I’ve found that there’s not a single table-generating package that does a good job of rendering decent-looking LaTeX, HTML and Word tables without changes in the code. So sometimes the code needs to know to which document format it’s going to render.

The function knitr::pandoc_to() returns the “final destination” of the document, which can be “latex”, “html” or “docx”.

knitr::pandoc_to()

It can also return a logical indicating if the output format is the one specified in the argument. This enables code that only runs for some formats and not others:

if (knitr::pandoc_to("docx")) {
   # Something to do only if the output is docx
}

Beware that knitr::pandoc_to() will return NULL when run interactively, so you might want to catch that case.

Other similar functions are is_latex_output() and is_html_output().

Configure cache path

My documents often have some code that takes a while to run so I make liberal use of the cache feature. But sometimes I like to control where that cache is stored. That can be done with knitr::opts_chunk$set(cache.path = path).

This is a good solution when rendering to multiple formats, since changing the format seems to invalidate the cache and make it useless. So what I do is set up one cache for each format:

format <- knitr::pandoc_to()

knitr::opts_chunk$set(
  cache.path = file.path("cache", format, "")  # The last "" is necessary
)

Get current file

The knitr::current_input() returns the input file being rendered by knitr. This can be useful in a bunch of cases, but I use it to, again, control cache and figure locations.

On a bookdown document, I like each chapter to use its own folder for cache and figures, so I have this in my setup chunk:

format <- knitr::pandoc_to() 
chapter <- tools::file_path_sans_ext(knitr::current_input())

knitr::opts_chunk$set(
  fig.path   = file.path("figures", chapter, ""),
  cache.path = file.path("cache", chapter, format, "")
)

Easily invalidate cache

Speaking of cache, sometimes I want to run your document from scratch without the cache. Either as a final test that all the code runs well, or when faced with strange bugs that I suspect might be cache-related.

So I almost always set up a cache.extra chunk option that will invalidate the cache each time it changes.

knitr::opts_chunk$set(cache.extra = 42)  # Change the number to invalidate cache

Do stuff after knitting

Knitr runs the “document” hook after knitting. You can customise that hook to do whatever you want:

# First save the default hook to use later. 
knit_doc <- knitr::knit_hooks$get("document")

knitr::knit_hooks$set(document = function(x) {
   # So stufff
   knit_doc(x)  # Then do whatever knitr was going to do anyway
})

For example, I sometimes like to add this so I get a desktop notification when my computer finishes knitting

# This needs the notify-send cli. 
# https://manpages.ubuntu.com/manpages/bionic/man1/notify-send.1.html
notify <- function(title = "title", text = NULL, time = 2) {
   time <- time*1000
   system(paste0('notify-send "', title, '" "', text, '" -t ', time, ' -a rstudio'))
}


start_time <- unclass(Sys.time())
min_time <- 5*3600  # Only notify for long-running jobs (5 minutes)

knit_doc <- knitr::knit_hooks$get("document")
knitr::knit_hooks$set(document = function(x) {
   
   took <- unclass(Sys.time()) - start_time
   if (took >= min_time) {
      notify("Done knitting!", 
             paste0("Took ", round(took), " seconds"),
             time = 5)
   }  
   
   
   knit_doc(x)
})

You might use this to get notifications to your phone with RPushbullet or send emails with emayili. You might also want to check the notifier package.

Scripts to and from RMarkdown

Finally, Katie highlighted the knitr::spin() function, which turns specially formatted R scripts and turns them into RMarkdown documents. And for the exact opposite workflow, Ken Butler points out the knitr::purl() function.

To leave a comment for the author, please follow the link and comment on their blog: Code R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

11 tricks to level up your rmarkdown documents

Make chunk options non-optional

Captions using text references

Default caption

Save plots in multiple formats

Exit prematurely

Get the output format

Configure cache path

Get current file

Easily invalidate cache

Do stuff after knitting

Scripts to and from RMarkdown

Related

Make chunk options non-optional

Captions using text references

Default caption

Save plots in multiple formats

Exit prematurely

Get the output format

Configure cache path

Get current file

Easily invalidate cache

Do stuff after knitting

Scripts to and from RMarkdown

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)