**The blog of Kun Ren**, and kindly contributed to R-bloggers)

In pipeR 0.4 version, one of the new features is `Pipe()`

function. The function basically creates a Pipe object that allows command chaining with `$`

, and thus makes it easier to perform operations in pipeline without any external operator.

In this post, I will introduce how to use this function and some basic knowledge about how it works. But before that, I would like to make clear that you don't have to learn a whole new thing if you are familiar with magrittr's `%>%`

operator or pipeR's `%>>%`

operator. If you are not, you can go ahead without hesitation. After all, the tools are made to be easier to work with.

## Introducing `Pipe()`

Consider a task we plot the log differences of 100 normally distributed random numbers with mean 10. The traditional code can be written as

```
plot(diff(log(rnorm(100, mean = 10))),col = "red")
```

magrittr's `%>%`

and pipeR's `%>>%`

are designed to chain these commands in a human readable way. With `%>%`

operator, the code can be restructured like

```
library(magrittr)
rnorm(100, mean = 10) %>%
log %>%
diff %>%
plot(col="red")
```

In this case, `%>%`

and `%>>%`

are interchangeable which produce similar output. The operator does nothing special but hack the expression so that the left-hand side object is inserted into the function call on the right-hand side of the operator.

```
library(pipeR)
rnorm(100, mean = 10) %>>%
log %>>%
diff %>>%
plot(col="red")
```

From the examples above, it seems that `%>%`

and `%>>%`

are exactly the same. In fact, they are not. I wrote an article *Difference between magrittr and pipeR* to explain their differences.

Both operators can solve the problem above by building a pipeline to avoid deeply nested code and make the operations readable. But is there an even easier way? The answer is Yes.

With `Pipe()`

function introduced in pipeR 0.4, the code can be more simplified, even without any weird user-defined operator that has to be enclosed by `% %`

. It goes like

```
library(pipeR)
Pipe(rnorm(100, mean = 10))$
log()$
diff()$
plot(col="red")
```

You may have noticed that the pipeline starts with `Pipe()`

function. This function basically creates a Pipe object which, in essence, is an environment which stores a value and whose `$`

is specially defined to perform first-argument piping. If a function name that follows `$`

is called, then the resulted value will be stored in the next-level Pipe object.

```
Pipe(c(1,2,3))$
mean()
```

```
$value : numeric
------
[1] 2
```

Note that the output indicates that the result is not a simple numeric vector but *a box* that contains that numeric vector as an element `$value`

.

To see the difference, try to run

```
Pipe(c(1,2,3))$mean() + 1
```

```
Error: non-numeric argument to binary operator
```

If the pipeline returns a numeric value `2`

, it should add 1 and return 3 as a result. Clearly, this is not the case. It is the box containing the value that allows `$`

to perform more levels of piping. In fact, The pipeline construction does not stop until the value is extracted by `$value`

.

```
Pipe(c(1,2,3))$
mean()$
value
```

```
[1] 2
```

or simply `[]`

as a shortcut.

```
Pipe(c(1,2,3))$
mean() []
```

```
[1] 2
```

Once the value is extracted from the box (or Pipe environment), the pipeline is ended with the stored value returned.

Having known these features, `Pipe()`

function can be used to work with pipeline-friendly packages such as dplyr, ggvis, and rlist. Here are some simple examples.

`Pipe()`

works with dplyr functions.

```
library(dplyr)
Pipe(mtcars)$
filter(mpg <= mean(mpg))$
select(mpg, cyl, wt)$
group_by(cyl)$
do(Pipe(.)$
arrange(wt)$
head(1)$
value)$
value
```

```
Source: local data frame [2 x 3]
Groups: cyl
mpg cyl wt
1 19.7 6 2.77
2 15.8 8 3.17
```

`Pipe()`

works with ggvis.

```
library(ggvis)
Pipe(mtcars)$
ggvis(~ mpg, ~ wt)$
layer_points()$
layer_smooths()
```

`Pipe()`

also works with rlist.

```
library(rlist)
Pipe(1:10)$
list.filter(x ~ x <= 5)$
list.mapv(letters[.])
```

```
$value : character
------
[1] "a" "b" "c" "d" "e"
```

## More features

As I mentioned in *Introducing pipeR 0.4*, pipeR's `%>>%`

operator is able to

- Pipe left-hand side object as the first argument to the right-hand side function name or call;
- Pipe as
`.`

within`{}`

or by lambda expression within`()`

; - Extract element when followed by a name enclosed by
`()`

(new feature in version 0.4-1).

The same features are supported with `.()`

function used with `Pipe()`

. For example,

```
Pipe(mtcars)$
.(lm(mpg ~ cyl + wt, data = .))$
summary()$
.(coefficients)
```

```
$value : matrix
------
Estimate Std. Error t value Pr(>|t|)
(Intercept) 39.686 1.7150 23.141 3.043e-20
cyl -1.508 0.4147 -3.636 1.064e-03
wt -3.191 0.7569 -4.216 2.220e-04
```

You can regard the above code as evaluated in the following steps:

```
m <- lm(mpg ~ cyl + wt, data = mtcars)
msum <- summary(m)
msum$coefficients
```

A noteworthy difference between the results produced by the two cases is that the final result produced by `Pipe()`

is still stored in the Pipe object (the box), and you can extract the value or build longer pipeline with it. For example,

```
model <- Pipe(mtcars)$
.(lm(mpg ~ cyl + wt, data = .))
```

Then `model`

is a Pipe object in which the value is a linear model and can be used for further piping.

```
model$summary()$.(r.squared)
```

```
$value : numeric
------
[1] 0.8302
```

```
model$predict(list(cyl = 6, wt = 2.9))
```

```
$value : numeric
------
1
21.39
```

Another interesting feature of Pipe object is about creating easy-to-use closures (roughly, a function created runtime within a context). For example, we can create a closure that generates 10 uniformly distributed numbers but its range is undecided.

```
rnd <- Pipe(10)$runif
```

A function `rnd(...)`

has been created an it can be used to generate 10 uniformly distributed random numbers with different settings of range.

```
rnd(min = 1, max = 2)
```

```
$value : numeric
------
[1] 1.258 1.552 1.056 1.469 1.484 1.812 1.370 1.547 1.170 1.625
```

```
rnd(min = 10, max = 20)
```

```
$value : numeric
------
[1] 18.82 12.80 13.98 17.63 16.69 12.05 13.58 13.59 16.90 15.36
```

## Performance

The overhead of `Pipe()`

function is very low. Its performance is very close to `%>>%`

. In intensive iterations, using `Pipe()`

may also save some time. For more details, see pipeR's vignette Performance.

## Conclusion

While `%>%`

and `%>>%`

implements operator-based pipeline like in F#, `Pipe()`

function implements an object-like pipeline mechanism like the implementation in jQuery in JavaScript and LINQ in C#.

It dynamically creates closures as if the object had the child function to operate with it. It is more light-weight and easier to type than operator approach especially in R which requires user-defined operators take a name enclosed by `% %`

.

If you like this idea, just install pipeR with

```
install.packages("pipeR")
```

and try `Pipe()`

.

**leave a comment**for the author, please follow the link and comment on their blog:

**The blog of Kun Ren**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...