# Easier way to chain commands using Pipe function

August 15, 2014

[This article was first published on The blog of Kun Ren, and kindly contributed to R-bloggers.]

One of the new features in pipeR 0.4 is the `Pipe()` function. It creates a Pipe object that allows command chaining with `$`, which makes it easier to perform operations in a pipeline without any external operator.

In this post, I will show how to use this function and explain the basics of how it works. Before that, I would like to make clear that you don't have to learn a whole new thing if you are already familiar with magrittr's `%>%` operator or pipeR's `%>>%` operator. If you are not, you can go ahead without hesitation. After all, these tools are meant to make things easier to work with.

## Introducing `Pipe()`

Consider a task in which we plot the log differences of 100 normally distributed random numbers with mean 10. The traditional code can be written as

``````
plot(diff(log(rnorm(100, mean = 10))), col = "red")
``````

magrittr's `%>%` and pipeR's `%>>%` are designed to chain these commands in a human-readable way. With the `%>%` operator, the code can be restructured as

``````
library(magrittr)
rnorm(100, mean = 10) %>%
  log %>%
  diff %>%
  plot(col = "red")
``````

In this case, `%>%` and `%>>%` are interchangeable and produce similar output. Neither operator does anything special: each rewrites the expression so that the left-hand side object is inserted into the function call on the right-hand side of the operator.

``````
library(pipeR)
rnorm(100, mean = 10) %>>%
  log %>>%
  diff %>>%
  plot(col = "red")
``````
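
To make the idea of rewriting the expression concrete, here is a minimal base R sketch of a toy first-argument pipe. The operator name `%then%` is made up for this illustration, and the code is only an assumption about how such an operator could work, not the actual implementation of magrittr or pipeR.

```r
# Toy first-argument pipe, for illustration only; `%then%` is a made-up name
# and this is not how magrittr or pipeR are actually implemented.
`%then%` <- function(lhs, rhs) {
  rhs_expr <- substitute(rhs)  # capture the right-hand side unevaluated
  if (is.call(rhs_expr)) {
    # f(a, b) becomes f(lhs, a, b): splice lhs in as the first argument
    parts <- as.list(rhs_expr)
    new_call <- as.call(c(list(parts[[1]], quote(lhs)), parts[-1]))
  } else {
    # a bare function name f becomes f(lhs)
    new_call <- as.call(list(rhs_expr, quote(lhs)))
  }
  # evaluate the rewritten call with `lhs` bound to the piped value
  eval(new_call, envir = list(lhs = lhs), enclos = parent.frame())
}

c(1, 4, 9) %then% sqrt %then% sum         # same as sum(sqrt(c(1, 4, 9))), i.e. 6
c(1.234, 5.678) %then% round(digits = 1)  # same as round(c(1.234, 5.678), digits = 1)
```

The real operators handle many more cases (such as `.` placeholders and lambda expressions), so this sketch only covers the first-argument piping used in the examples above.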

From the examples above, it seems that `%>%` and `%>>%` are exactly the same. In fact, they are not. I wrote an article, Difference between magrittr and pipeR, to explain their differences.

Both operators solve the problem above by building a pipeline that avoids deeply nested code and makes the operations readable. But is there an even easier way? The answer is yes.

With the `Pipe()` function introduced in pipeR 0.4, the code can be simplified further, without any user-defined operator that has to be enclosed in `% %`. It goes like this:

``````
library(pipeR)
Pipe(rnorm(100, mean = 10))$
  log()$
  diff()$
  plot(col = "red")
``````

You may have noticed that the pipeline starts with the `Pipe()` function. This function creates a Pipe object which, in essence, is an environment that stores a value and whose `$` is specially defined to perform first-argument piping. When a function name following `$` is called, the resulting value is stored in the next-level Pipe object.

``````
Pipe(c(1,2,3))$
  mean()
``````
``````
$value : numeric
------
[1] 2
``````

Note that the output indicates that the result is not a simple numeric vector but a box that contains that numeric vector as an element `$value`.

To see the difference, try running

``````
Pipe(c(1,2,3))$mean() + 1
``````
``````
Error: non-numeric argument to binary operator
``````

If the pipeline returned the plain numeric value `2`, adding 1 would give 3. Clearly, this is not the case. It is the box containing the value that allows `$` to perform more levels of piping. In fact, the pipeline construction does not stop until the value is extracted by `$value`.

``````
Pipe(c(1,2,3))$
  mean()$
  value
``````
``````
[1] 2
``````

or simply `[]` as a shortcut.

``````
Pipe(c(1,2,3))$
  mean() []
``````
``````
[1] 2
``````

Once the value is extracted from the box (the Pipe environment), the pipeline ends and the stored value is returned.
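
The box behaviour described above can be imitated in a few lines of base R. The sketch below is only an illustration under my own assumptions: `Box()` and `$.Box` are hypothetical names, and this is not pipeR's actual source code. It shows an environment that stores a value, a specially defined `$` that performs first-argument piping into a new box, and `$value` to open the box.

```r
# Hypothetical mini "box", written only to illustrate the idea;
# this is NOT pipeR's actual source code.
Box <- function(value) {
  env <- new.env()
  assign("value", value, envir = env)  # the box stores one value
  class(env) <- "Box"                  # so that `$` dispatches to `$.Box`
  env
}

`$.Box` <- function(x, name) {
  stored <- get("value", envir = x, inherits = FALSE)
  if (name == "value") {
    return(stored)                     # `$value` opens the box
  }
  # Any other name is looked up as a function from the caller's scope ...
  f <- get(name, envir = parent.frame(), mode = "function")
  # ... and wrapped in a closure that pipes the stored value as the first
  # argument and puts the result into a new box.
  function(...) Box(f(stored, ...))
}

boxed <- Box(c(1, 2, 3))$mean()     # still a Box, so `boxed + 1` would fail
boxed$value                         # 2
Box(c(1, 4, 9))$sqrt()$sum()$value  # 6
```

Because every call returns a new box rather than the bare value, the chain can continue for as many steps as needed, which is why the value has to be extracted explicitly at the end.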

With these features in mind, the `Pipe()` function can be used with pipeline-friendly packages such as dplyr, ggvis, and rlist. Here are some simple examples.

`Pipe()` works with dplyr functions.

``````
library(dplyr)
Pipe(mtcars)$
  filter(mpg <= mean(mpg))$
  select(mpg, cyl, wt)$
  group_by(cyl)$
  do(Pipe(.)$
    arrange(wt)$
    head(1)$
    value)$
  value
``````
``````
Source: local data frame [2 x 3]
Groups: cyl

   mpg cyl   wt
1 19.7   6 2.77
2 15.8   8 3.17
``````

`Pipe()` works with ggvis.

``````
library(ggvis)
Pipe(mtcars)$
  ggvis(~ mpg, ~ wt)$
  layer_points()$
  layer_smooths()
``````

`Pipe()` also works with rlist.

``````
library(rlist)
Pipe(1:10)$
  list.filter(x ~ x <= 5)$
  list.mapv(letters[.])
``````
``````
$value : character
------
[1] "a" "b" "c" "d" "e"
``````

## More features

As I mentioned in Introducing pipeR 0.4, pipeR's `%>>%` operator is able to

• Pipe left-hand side object as the first argument to the right-hand side function name or call;
• Pipe as `.` within `{}` or by lambda expression within `()`;
• Extract element when followed by a name enclosed by `()` (new feature in version 0.4-1).

The same features are supported by the `.()` function when used with `Pipe()`. For example,

``````
Pipe(mtcars)$
  .(lm(mpg ~ cyl + wt, data = .))$
  summary()$
  .(coefficients)
``````
``````
$value : matrix
------
            Estimate Std. Error t value  Pr(>|t|)
(Intercept)   39.686     1.7150  23.141 3.043e-20
cyl           -1.508     0.4147  -3.636 1.064e-03
wt            -3.191     0.7569  -4.216 2.220e-04
``````

You can regard the above code as evaluated in the following steps:

``````
m <- lm(mpg ~ cyl + wt, data = mtcars)
msum <- summary(m)
msum$coefficients
``````

A noteworthy difference between the results produced by the two cases is that the final result produced by `Pipe()` is still stored in the Pipe object (the box), and you can extract the value or build a longer pipeline with it. For example,

``````
model <- Pipe(mtcars)$
  .(lm(mpg ~ cyl + wt, data = .))
``````

Then `model` is a Pipe object whose value is a linear model, and it can be used for further piping.

``````
model$summary()$.(r.squared)
``````
``````
$value : numeric
------
[1] 0.8302
``````
``````
model$predict(list(cyl = 6, wt = 2.9))
``````
``````
$value : numeric
------
    1
21.39
``````

Another interesting feature of the Pipe object is that it makes it easy to create closures (roughly, functions created at runtime within a context). For example, we can create a closure that generates 10 uniformly distributed numbers whose range is left undecided.

``````
rnd <- Pipe(10)$runif
``````

A function `rnd(...)` has been created, and it can be used to generate 10 uniformly distributed random numbers with different range settings.

``````
rnd(min = 1, max = 2)
``````
``````
$value : numeric
------
[1] 1.258 1.552 1.056 1.469 1.484 1.812 1.370 1.547 1.170 1.625
``````
``````
rnd(min = 10, max = 20)
``````
``````
$value : numeric
------
[1] 18.82 12.80 13.98 17.63 16.69 12.05 13.58 13.59 16.90 15.36
``````
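
For comparison, the same kind of closure can be written by hand in base R. This is only a rough sketch of the idea: `make_rnd` and `rnd10` are hypothetical names, and unlike the Pipe version, the result here is a plain numeric vector rather than a box.

```r
# A hand-written closure that fixes the first argument of runif() and leaves
# the remaining arguments open. `make_rnd` is a hypothetical helper name.
make_rnd <- function(n) {
  force(n)                     # evaluate n now so the closure remembers it
  function(...) runif(n, ...)  # n comes from the enclosing call
}

rnd10 <- make_rnd(10)
x <- rnd10(min = 1, max = 2)   # 10 random numbers between 1 and 2
y <- rnd10(min = 10, max = 20) # 10 random numbers between 10 and 20
```

The Pipe object saves you from writing the factory function yourself: accessing a function name with `$` without calling it already yields the closure.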

## Performance

The overhead of the `Pipe()` function is very low, and its performance is very close to that of `%>>%`. In intensive iterations, using `Pipe()` may also save some time. For more details, see pipeR's vignette Performance.

## Conclusion

While `%>%` and `%>>%` implement an operator-based pipeline like the one in F#, the `Pipe()` function implements an object-based pipeline mechanism similar to those of jQuery in JavaScript and LINQ in C#.

It dynamically creates closures as if the object had child functions that operate on it. It is more lightweight and easier to type than the operator approach, especially in R, which requires user-defined operators to take a name enclosed in `% %`.

If you like this idea, just install pipeR with

``````
install.packages("pipeR")
``````

and try `Pipe()`.
