How to make a reprex

[This article was first published on Articles - The Analyst Code, and kindly contributed to R-bloggers.]

What is a reprex?

A reprex is a REPRoducible EXample. Although the short name is the common one, what we really want is more properly called a minimal reproducible example.

This is example code. The purpose of a reprex is to illustrate a particular outcome. The reader should know what outcome to expect. The reprex should do that one thing, and only that thing.

This is reproducible code. Any user who wishes should be able to run the reprex and get exactly the same result.

This is minimal code. It contains only the functions that are absolutely necessary to illustrate the point. If this is in the context of data analysis, the included dataset is as small as possible.

Why would we need a reprex?

  1. To demonstrate an issue or bug in our code

  2. To demonstrate a solution to an issue or bug in someone else’s code

  3. To document a useful snippet of code for future reference

What are the benefits of a good reprex?

  1. Squash your bugs

    • Creating a minimum example often helps you to isolate the core issue yourself. Frequently, the very process of making a reprex helps find a solution for the bug since a reprex, at its core, is clear, concise, and clean code!

    • If you can’t solve it yourself, creating a reprex will help your friends help you. I had one coworker who would frequently post screenshots of his RStudio session and ask for help. No one wants to volunteer to transcribe a picture into runnable code, especially when the code begins with a custom dataset that the rest of us cannot access. On the other hand, many people love a 10-line puzzle: “This code says my average test score was 9600. I was expecting a number between 0 and 100. What did I do wrong? [insert reprex here]”

  2. Teach others

    • Coding is collaborative. You will find that online communities like Stack Overflow are always willing to help and teach, provided you ask politely and provide a clean reprex. One of the joys of the profession is that you are always learning and, conversely, are always able to help others learn, too.

    • When someone asks for help, one of the best ways to demonstrate a solution to their problem is through a reprex. Is a friend trying to understand the t.test function? Send him a working example so he can see firsthand how the function is meant to be used and tinker with it.

  3. Save code for future use

    • While I was learning ggplot2, I built a long how-to guide akin to the BBC’s R Cookbook. Every time I needed to build a new chart for my work, I added it to my guide using a simple dataset. Before too long, I had a reference tool full of example charts in my preferred colors and formatting. Each example chart was its own reprex for my future self to reference.
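
The 10-line puzzle described above might look something like the following sketch. The scores and variable names are hypothetical, invented purely for illustration; the point is that the expected and actual results are stated up front and the code runs as pasted.

```r
# Expected: an average test score between 0 and 100
# Actual:   9600

# Hypothetical scores, already on a 0-100 scale
test_scores <- c(92, 98, 98)

# Suspect line: multiplies by 100 as if the scores were proportions
average_score <- mean(test_scores) * 100
average_score
#> [1] 9600
```

With the problem reduced to three numbers and one calculation, a helper can see at a glance that the extra multiplication is the culprit.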

What makes a good reprex?

  • The reprex tells the user what you expect to see and what you actually saw instead

# Expect this code to color my line chart shades of blue
# Lines still have the default color scheme

# [Insert code here]

  • The reprex will return the same result you saw (if it uses random data, call set.seed() first so that the result is reproducible)

set.seed(1)
coin_flips <- sample(c(0, 1), 100, replace = TRUE)

  • The reprex runs after a simple copy-paste (doesn’t require downloading datasets, uncommenting lines, etc.)
  • The reprex has no side effects on the user’s computer (doesn’t call setwd() or rm(list = ls()), and doesn’t download unknown files from the internet)
  • The reprex specifies all necessary package calls at the beginning with library(xxx) and/or namespaces functions (dplyr::select(xxx))
  • The reprex prefers mtcars or iris over a custom data table. If neither of those suffices, the reprex includes code for creating example data

sample_data <- tibble::tibble(
  DATES = seq(
    from = as.Date("2020-01-01"),
    to = as.Date("2020-01-31"),
    by = "day"
  ),
  VALUES = seq(from = 1, to = 31, by = 1)
)

  • The reprex has useful variable names and comments

library(dplyr)  # provides the %>% pipe used below

# I expect the two calculations to be identical, but this returns 5 rows!
mtcars %>%
  dplyr::mutate(
    BASE_R_CALC = ...,
    DPLYR_CALC = ...
  ) %>%
  dplyr::filter(BASE_R_CALC != DPLYR_CALC)

  • The reprex uses good coding style
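
Two of the points above are easy to check in base R. The sketch below, with illustrative variable names, suggests how seeding makes random data repeatable and how a temporary file avoids touching the reader's working directory.

```r
# Seeded random data: the same seed gives the same draws on every run
set.seed(1)
first_run <- sample(c(0, 1), 100, replace = TRUE)
set.seed(1)
second_run <- sample(c(0, 1), 100, replace = TRUE)
identical(first_run, second_run)
#> [1] TRUE

# No side effects: write to a temporary file rather than calling setwd()
tmp_file <- tempfile(fileext = ".csv")
utils::write.csv(head(mtcars, 3), tmp_file)
round_trip <- utils::read.csv(tmp_file, row.names = 1)
unlink(tmp_file)  # clean up so nothing is left behind
nrow(round_trip)
#> [1] 3
```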

Enhance your reprex using the reprex package

If you run install.packages("reprex"), you can use a function, reprex(), to generate clean example code with commented output. Of course, you need to create the minimal code yourself, but once you have it, calling the reprex function adds in the output so that others can see exactly what your computer returned. It’s a minor change but can make all the difference to someone who wants to help you.

Compare this reprex:

dplyr::filter(mtcars, mpg > 20 & cyl == 6)

to this one:

dplyr::filter(mtcars, mpg > 20 & cyl == 6)
#>                 mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4      21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1

The first reprex above is as you would type it in R. It is short and executable. For the second, I copied the first to my clipboard and then ran reprex::reprex() in the console. That function executed my code and put a clean reprex, with commented output, on my clipboard for me to paste elsewhere. The result is short, executable, and highly informative.

The reprex package has a few additional handy features. It can append session info if your bug has to do with your versions of R and packages (reprex(..., session_info = TRUE)). It can specify the markup style (reprex(..., venue = "so") for Stack Overflow, venue = "gh" for GitHub, venue = "ds" for Discourse). It can upload figure outputs to imgur.com.
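
As a rough sketch of how those options combine, the call below passes the earlier example as an expression and uses only the two arguments mentioned above. The guard around it is an assumption added here so the snippet is harmless on machines without the reprex package or outside an interactive session; it is not part of the package's own workflow.

```r
# Assumption: the reprex package may not be installed, so check first
can_run_reprex <- requireNamespace("reprex", quietly = TRUE)

# Only render in an interactive session, where reprex() can use the
# clipboard and show a preview
if (can_run_reprex && interactive()) {
  reprex::reprex(
    {
      dplyr::filter(mtcars, mpg > 20 & cyl == 6)
    },
    venue = "gh",         # format the output for GitHub
    session_info = TRUE   # append R and package versions
  )
}
```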

Where can I learn more?

The reprex package’s do’s and don’ts page

Hadley Wickham’s Advanced R book

This Stack Overflow thread
