More Readable Code with Pipes in R

July 30, 2014

(This article was first published on Econometrics by Simulation, and kindly contributed to R-bloggers)

Several blog posts have mentioned the ‘magrittr’ package, which allows arguments to be passed to functions in a pipe-style fashion (David Smith).

This stylistic option has several advantages:
1. Reduced need for nested parentheses
2. The order of functional operations now reads from left to right
3. Organizational style of the code may be improved

The package provides a new operator, %>%, which tells R to take the value on its left and pass it as the first argument to the function on its right. Let us see this in action with some text functions.
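As a minimal sketch of that rule, assuming the magrittr package is installed, the piped and nested forms below should give the same result. The vector x and the names piped, nested, piped_chain and nested_chain are made up for illustration and do not come from the original post.

```r
library(magrittr)

# Hypothetical example data, not from the original post
x <- c(1.234, 5.678, 9.1011)

# x %>% round(1) passes x as the first argument, i.e. round(x, 1)
piped  <- x %>% round(1)
nested <- round(x, 1)
identical(piped, nested)

# A chain reads left to right; the nested form reads inside out
piped_chain  <- x %>% sum() %>% sqrt() %>% round(2)
nested_chain <- round(sqrt(sum(x)), 2)
identical(piped_chain, nested_chain)
```

Both calls to identical() compare a piped expression with its nested equivalent, so either form can be used interchangeably; the choice is purely about readability.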

# Let's play with some strings
str1 = "A scratch? Your arm's off."
str2 = "I've had worse."
str1 %>% substr(3,9)  
# [1] "scratch"
str1 %>% strsplit('?',fixed=TRUE)
# [[1]]
# [1] "A scratch"        " Your arm's off."
# Pipes can be chained as well
str1 %>% paste(str2) %>% toupper()
# Let's see how pipes might work with drawing random variables
# I like to define a function that allows an element by element maximization
vmax <- function(x, maximum=0) x %>% cbind(maximum) %>% apply(1, max)
-5:5 %>% vmax
# [1] 0 0 0 0 0 0 1 2 3 4 5
# This is identical to defining the function as:
vmax <- function(x, maximum=0) apply(cbind(x,maximum), 1, max)
# Notice that the latter formulation uses the same number of parentheses
# and may be more readable.
# However, recently I was drawing data for a simulation in which I wanted to
# draw Nitem values from the quantiles of the normal distribution, censor the
# values at 0 and then randomize their order.
Nitem  <- 100
ctmean <- 1
ctsd   <- .5
draws <- seq(0, 1, length.out = Nitem+2)[-c(1,Nitem+2)] %>%
         qnorm(ctmean,ctsd) %>% vmax %>% sample(Nitem)
# While this looks ugly, let's see how much worse it would have been without pipes
draws <- sample(vmax(qnorm(seq(0, 1, length.out = Nitem+2)[-c(1,Nitem+2)],
                           ctmean, ctsd)), Nitem)
# Both functional sequences are ugly, though I think I prefer the first, which
# I can easily read as: seq is passed to qnorm, passed to vmax, passed to sample.
# A few things to note with the %>% operator. If you want to send the value to
# an argument which is not the first or is a named value, use the '.'
mydata <- seq(0, 1, length.out = Nitem+2)[-c(1,Nitem+2)] %>%
          qnorm(ctmean,ctsd) %>% vmax %>% sample(Nitem) %>%
          data.frame(index = 1:Nitem , theta = .)
# Also note that the operator is evaluated earlier than you might expect: %>%
# has higher precedence than arithmetic operators such as +.
# Thus:
1 + 8 %>% sqrt
# Returns 3.828427
# Rather than
(1 + 8) %>% sqrt
# [1] 3
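
The two notes above, on the '.' placeholder and on operator precedence, can be checked directly. The following is a small sketch, assuming the magrittr package is installed; theta_values, df, no_parens and with_parens are hypothetical names introduced only for this illustration.

```r
library(magrittr)

# '.' marks where the left-hand value goes; here it fills the named
# argument 'theta' rather than the first argument of data.frame()
theta_values <- c(0.2, 0.5, 0.9)  # hypothetical data
df <- theta_values %>% data.frame(index = 1:3, theta = .)
names(df)

# %>% is evaluated before +, so this is 1 + sqrt(8), not sqrt(1 + 8)
no_parens   <- 1 + 8 %>% sqrt
with_parens <- (1 + 8) %>% sqrt
all.equal(no_parens, 1 + sqrt(8))
all.equal(with_parens, 3)
```

When an arithmetic expression feeds a pipe, wrapping it in parentheses or assigning it to a variable first avoids the precedence surprise.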
