More Readable Code with Pipes in R

July 30, 2014

(This article was first published on Econometrics by Simulation, and kindly contributed to R-bloggers)

Several blog posts have mentioned the 'magrittr' package, which allows values to be passed to functions in a pipe-style fashion (David Smith).

This stylistic option has several advantages:
 
1. Less need for nested parentheses
2. The order of functional operations now reads from left to right
3. Organizational style of the code may be improved

The package provides a new operator, %>%, which tells R to take the value on its left and pass it as the first argument to the function on its right. Let us see this in action with some text functions.
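
Before the string examples, a minimal sketch of the rule itself may help. The values below are made up for illustration and are not from the original post; the point is only that a piped call and its nested equivalent should give identical results.

```r
library(magrittr)

x <- c(1, 4, 9)  # illustrative values, not from the original post

# x %>% f is evaluated as f(x)
piped  <- x %>% sqrt
nested <- sqrt(x)

# x %>% f(y) is evaluated as f(x, y):
# the left-hand side becomes the first argument
piped_round  <- 3.14159 %>% round(2)
nested_round <- round(3.14159, 2)

# Both comparisons should report TRUE
identical(piped, nested)
identical(piped_round, nested_round)
```

Writing the nested form next to the piped form is a quick way to check what a pipe chain does when in doubt.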




library(magrittr)  # stops with an error if the package is not installed
 
# Let's play with some strings
 
str1 = "A scratch? Your arm's off."
str2 = "I've had worse."
 
str1 %>% substr(3,9)  
# [1] "scratch"
 
str1 %>% strsplit('?',fixed=TRUE)
#[[1]]
#[1] "A scratch"        " Your arm's off."
 
# Pipes can be chained as well
str1 %>% paste(str2) %>% toupper()
# [1] "A SCRATCH? YOUR ARM'S OFF. I'VE HAD WORSE."
 
# Let's see how pipes might work with drawing random variables
 
# I like to define a function that takes an element-by-element maximum
# of a vector against a floor value (0 by default)
 
vmax <- function(x, maximum=0) x %>% cbind(maximum) %>% apply(1, max)
-5:5 %>% vmax
# [1] 0 0 0 0 0 0 1 2 3 4 5
 
# This is identical to defining the function as:
vmax <- function(x, maximum=0) apply(cbind(x,maximum), 1, max)
vmax(-5:5)
 
# Notice that the latter formulation uses the same number of parentheses
# and may be more readable.
 
# However recently I was drawing data for a simulation in which I wanted to
# draw Nitem values from the quantiles of the normal distribution, censor the
# values at 0 and then randomize their order.
 
Nitem  <- 100
ctmean <- 1
ctsd   <- .5
 
draws <- seq(0, 1, length.out = Nitem+2)[-c(1,Nitem+2)] %>%
         qnorm(ctmean,ctsd) %>% vmax %>% sample(Nitem)
 
# While this looks ugly, let's see how much worse it would be without pipes
draws <- sample(vmax(qnorm(seq(0, 1, length.out = Nitem+2)[-c(1,Nitem+2)]
                  ,ctmean,ctsd)),Nitem)
 
# Both functional sequences are ugly, though I think I prefer the first, which
# I can easily read as: seq is passed to qnorm, then to vmax, then to sample
 
# A few things to note about the %>% operator. If you want to send the value to
# an argument that is not the first, or to a named argument, use the '.'
# placeholder
 
mydata <- seq(0, 1, length.out = Nitem+2)[-c(1,Nitem+2)] %>%
          qnorm(ctmean,ctsd) %>% vmax %>% sample(Nitem) %>%
          data.frame(index = 1:Nitem , theta = .)
 
# Also note that the operator has higher precedence than arithmetic operators
# such as +, so it is applied sooner than you might expect.
# Thus:
 
1 + 8 %>% sqrt
# [1] 3.828427
 
# Rather than
(1 + 8) %>% sqrt
# [1] 3
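
The two notes at the end of the listing can be checked against their nested-call equivalents. This is a minimal sketch: the theta values and the names df_piped, df_nested, a and b are hypothetical and chosen only for illustration.

```r
library(magrittr)

# Dot placeholder: the left-hand side goes where the '.' appears
# instead of being inserted as the first argument.
theta <- c(0.2, 1.5, 0.9)  # hypothetical values
df_piped  <- theta %>% data.frame(index = 1:3, theta = .)
df_nested <- data.frame(index = 1:3, theta = theta)
all.equal(df_piped, df_nested)

# Precedence: %>% binds more tightly than binary +
a <- 1 + 8 %>% sqrt    # parsed as 1 + sqrt(8)
b <- (1 + 8) %>% sqrt  # parsed as sqrt(1 + 8)
identical(a, 1 + sqrt(8))
identical(b, sqrt(9))
```

When a pipe is mixed with arithmetic, wrapping the left-hand side in parentheses makes the intended grouping explicit.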