What is a reprex?
A reprex is a REPRoducible EXample. Although it is frequently called a reprex, what we really want is more properly called a minimal reproducible example.
This is example code. The purpose of a reprex is to illustrate a particular outcome. The reader should know what outcome to expect. The reprex should do that one thing, and only that thing.
This is reproducible code. The reprex should be able to be run by any user who so wishes and return the exact same result.
This is minimal code. It contains only the functions that are absolutely necessary to illustrate the point. If this is in the context of data analysis, the included dataset is as small as possible.
Why would we need a reprex?
To demonstrate an issue or bug in our code
To demonstrate a solution to an issue or bug in someone else’s code
To document a useful snippet of code for future reference
What are the benefits of a good reprex?
Squash your bugs
Creating a minimum example often helps you to isolate the core issue yourself. Frequently, the very process of making a reprex helps find a solution for the bug since a reprex, at its core, is clear, concise, and clean code!
If you can’t solve it yourself, creating a reprex will help your friends help you. I had one coworker who would frequently post screenshots of his RStudio session and ask for help. No one wants to volunteer to transcribe the picture into runnable code, especially when the code begins with a custom dataset that the rest of us cannot access. On the other hand, many people love a 10-line puzzle: “This code says my average test score was 9600. I was expecting a number between 0 and 100. What did I do wrong? [insert reprex here]”
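As an illustration, the test-score puzzle quoted above might look like the sketch below. The scores and the names test_scores and average_score are made up for this example, and the bug shown is only one of many ways to arrive at 9600.

```r
# Expected: an average test score between 0 and 100
# Actual: the code returns 9600

# Hypothetical scores, already on a 0-100 scale
test_scores <- c(94, 96, 98)

# Bug: the extra "* 100" treats the scores as proportions
average_score <- sum(test_scores) / length(test_scores) * 100
average_score
#> [1] 9600

# What was intended
mean(test_scores)
#> [1] 96
```

Everything a helper needs is in those few lines: the data, the expectation, and the surprising result.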
Coding is collaborative. You will find that online communities like Stack Overflow are always willing to help and teach, provided you ask politely and provide a clean reprex. One of the joys of the profession is that you are always learning and, conversely, are always able to help others learn, too.
When someone asks for help, one of the best ways to demonstrate a solution to their problem is through a reprex. Is a friend trying to understand the t.test function? Send him a working example so he can see firsthand how the function is built to be used and tinker with it.
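A working example of that kind could be as short as the following sketch. It uses the built-in mtcars data. The question it asks (whether fuel economy differs between automatic and manual cars) and the name mpg_by_transmission are assumptions chosen only to give the function something to do.

```r
# Welch two-sample t-test: does mpg differ by transmission type?
# In the built-in mtcars data, am is 0 for automatic and 1 for manual
mpg_by_transmission <- t.test(mpg ~ am, data = mtcars)

# Print the full test summary
mpg_by_transmission

# Pull out individual pieces to tinker with
mpg_by_transmission$p.value
mpg_by_transmission$conf.int
```

Because the data ships with R, the recipient can run this as-is and then swap in his own variables.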
Save code for future use
- There was a period of time when I was learning ggplot2 that I built a long how-to guide akin to the BBC’s R Cookbook. Every time I needed to build a new chart for my work, I added it to my guide using a simple dataset. Before too long, I had a reference tool with an example chart with my preferred colors and formatting. Each example chart was its own reprex for my future self to reference.
What makes a good reprex?
- The reprex tells the user what you expect to see and what you actually saw instead
# Expect this code to color my line chart shades of blue
# Lines still have the default color scheme
# [Insert code here]
- The reprex will return the same result you saw (if using random data, then call set.seed() so that it is reproducible)
set.seed(1)
coin_flips <- sample(c(0, 1), 100, replace = TRUE)
- The reprex runs after a simple copy-paste (doesn’t require downloading datasets from the internet, uncommenting lines, etc.)
- The reprex has no side effects on the user’s computer (doesn’t call rm(list = ls()) or download unknown files from the internet)
- The reprex specifies all necessary package calls at the beginning with library(xxx) and/or namespaces functions using the package::function() form
- The reprex prefers a built-in dataset such as iris over a custom data table. If a built-in dataset does not suffice, the reprex includes code for creating example data
sample_data <- tibble::tibble(
  DATES = seq(
    from = as.Date("2020-01-01"),
    to = as.Date("2020-01-31"),
    by = "day"
  ),
  VALUES = seq(from = 1, to = 31, by = 1)
)
- The reprex has useful variable names and comments
# I expect the two calculations to be identical, but this returns 5 rows!
mtcars %>%
  dplyr::mutate(
    BASE_R_CALC = ...,
    DPLYR_CALC = ...
  ) %>%
  dplyr::filter(BASE_R_CALC != DPLYR_CALC)
- The reprex uses good coding style
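Taken together, the checklist above might produce something like the following sketch. The scenario (checking that simulated coin flips are reproducible) and the names first_run, second_run, and flip_summary are hypothetical, chosen only to show the pieces in one place.

```r
# Goal: show that the simulated coin flips are reproducible.
# Expected: both runs produce identical flips, so the check returns TRUE.

# Only base R is used, so there are no library() calls to list.

# Fix the seed before each draw so every reader gets the same flips
set.seed(1)
first_run <- sample(c(0, 1), 100, replace = TRUE)

set.seed(1)
second_run <- sample(c(0, 1), 100, replace = TRUE)

# Small inline data set instead of a file the reader cannot access
flip_summary <- data.frame(
  RUN = c("first", "second"),
  SHARE_HEADS = c(mean(first_run), mean(second_run))
)
flip_summary

identical(first_run, second_run)
#> [1] TRUE
```

The snippet runs after a plain copy-paste, creates its own data, and changes nothing on the reader's machine beyond the three objects it defines.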
Enhance your reprex using the reprex package
After running install.packages("reprex"), you can use a function, reprex(), to generate clean example code with commented output. Of course, you need to create the minimal code yourself, but once you have it, calling the reprex() function adds in the output so that others can see exactly what your computer returned. It’s a minor change but can make all the difference to someone who wants to help you.
Compare this reprex
dplyr::filter(mtcars, mpg > 20 & cyl == 6)
to this one.
dplyr::filter(mtcars, mpg > 20 & cyl == 6)
#>                 mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4      21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
The first reprex above is as you would type it in R. It is short and executable. In the second, I copied the reprex and then ran reprex::reprex() in the console. That function executed my code and put a clean reprex, with commented output, on my clipboard for me to paste elsewhere. The result is short, executable, and highly informative.
The reprex package has a few additional handy features. It can append session info if your bug has to do with your versions of R and packages (reprex(..., session_info = TRUE)). It can specify the markup style (reprex(..., venue = "so") for Stack Overflow, venue = "gh" for GitHub, venue = "ds" for Discourse). It can save figure outputs to imgur.com.
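A call that combines these options might look like the sketch below. It assumes the reprex and dplyr packages are installed. The guard and the name reprex_ready are additions for this example, so that the snippet does nothing in a non-interactive session where there is no clipboard to paste from.

```r
# Assumption: reprex and dplyr are installed.
# The guard skips the call in non-interactive sessions.
reprex_ready <- interactive() && requireNamespace("reprex", quietly = TRUE)

if (reprex_ready) {
  reprex::reprex(
    {
      dplyr::filter(mtcars, mpg > 20 & cyl == 6)
    },
    venue = "gh",         # format the result for GitHub
    session_info = TRUE   # append R and package versions
  )
}
```

Passing the code as an expression in braces avoids the copy step: reprex() runs the expression in a fresh session and places the formatted result on the clipboard.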
Where can I learn more?
The reprex package’s do’s and don’ts page
Hadley Wickham’s Advanced R book
This Stack Overflow thread