# Piping is Method Chaining

[This article was first published on **R – Win-Vector Blog**, and kindly contributed to R-bloggers.]

What `R` users now call piping, popularized by Stefan Milton Bache and Hadley Wickham, is inline function application (this is notationally similar to, but distinct from, the powerful interprocess communication and concurrency tool introduced to Unix by Douglas McIlroy in 1973). In object oriented languages this sort of notation for function application has been called “method chaining” since the days of `Smalltalk` (~1972). Let’s take a look at method chaining in `Python`, in terms of pipe notation.

Let’s work an example using `Python`’s `Pandas` package (and classes).

```python
import pandas as pd

data = [['alpha', 'a', 1, 0],
        ['beta', 'b', 2, 10],
        ['gamma', 'b', 3, 10]]
df = pd.DataFrame(data, columns=['name', 'group', 'value', 'cost'])
print(df)
```

```
    name group  value  cost
0  alpha     a      1     0
1   beta     b      2    10
2  gamma     b      3    10
```

Method chaining is when methods return a reference to their host-object (or reference to a replacement for their host-object). This lets us call a sequence of methods one after the other as we show below.

```python
print(df.groupby("group").agg({"value":["max", "min"], "cost":["mean"]}))
```

```
      value     cost
        max min mean
group
a         1   1    0
b         3   2   10
```

This may not be considered legible (especially as it was combined with `print()` function notation), so we use a common notation convention and insert a line-break before each method dispatch “`.`”. The parentheses surrounding the whole expression are a common `Python` convention to facilitate multi-line expressions.

```python
(
    df
        .groupby("group")
        .agg({"value":["max", "min"], "cost":["mean"]})
        .pipe(print)
)
```

```
      value     cost
        max min mean
group
a         1   1    0
b         3   2   10
```

Or, to emphasize the similarity to pipes, we can use another convention (that contravenes the PEP8 style guide): end the lines with `.\`, which is the method dispatch “`.`” symbol plus a line continuation mark.

```python
df .\
    groupby("group") .\
    agg({"value":["max", "min"], "cost":["mean"]}) .\
    pipe(print)
```

```
      value     cost
        max min mean
group
a         1   1    0
b         3   2   10
```

The above is just as with the `Bizarro Pipe` in `R`: the pipe is available as a convention over the existing language syntax. In `Python` (for method chaining enabled classes and methods) the glyph “`.`” is in fact already a method application operator or pipe (as is the glyph “`.\EOL`”, where `EOL` denotes the line-break or end of line). With method-chaining conventions the “`.`” already is “a pipe” organizing method application from left to right without the need for illegible nesting. In `R` the glyph “`->.;`” is a function application operator or pipe (which we called the `Bizarro Pipe`; the `Bizarro Pipe` is a first-rate pipe, faster than other pipes, and interferes less with debugging than other pipes).
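To make the "left to right without nesting" point concrete, here is a minimal sketch comparing the nested and chained forms of the same computation. The helper functions (`keep_group_b`, `double_value`, `total_value`) and the small data frame are hypothetical, made up for illustration; only `DataFrame.pipe()` and `DataFrame.assign()` are actual `Pandas` methods.

```python
import pandas as pd

df = pd.DataFrame({"group": ["a", "b", "b"], "value": [1, 2, 3]})

# Hypothetical helper steps; each takes a data frame as its first argument.
def keep_group_b(d):
    return d[d["group"] == "b"]

def double_value(d):
    # assign() returns a new data frame, leaving d untouched.
    return d.assign(value=d["value"] * 2)

def total_value(d):
    return d["value"].sum()

# Nested form: the steps must be read inside-out.
nested = total_value(double_value(keep_group_b(df)))

# Chained form: the same steps, read left to right.
chained = (
    df
        .pipe(keep_group_b)
        .pipe(double_value)
        .pipe(total_value)
)

print(nested, chained)
```

Both expressions apply the same three functions in the same order; only the reading direction differs.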

Both languages have had this application capability for a very long time. We are using `Pandas` and `pipe()` as our example, but any package whose methods return a reference to the object being worked on (or a reference to a replacement object) can be treated as a pipe-able object. If the class further implements one function re-director method (such as `pipe()`) then a lot more becomes practical. Here is another example showing how additional named and unnamed arguments can be handled.

```python
def add_delta_to_column(df, colname, delta):
    df[colname] = df[colname] + delta
    return df
```

```python
df .\
    pipe(add_delta_to_column, "cost", 5) .\
    groupby("group") .\
    agg({"value":["max", "min"], "cost":["mean"]}) .\
    pipe(print, "DEBUG1", sep = " | ")
```

```
      value     cost
        max min mean
group
a         1   1    5
b         3   2   15 | DEBUG1
```
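The claim that any class can be made pipe-able this way can be sketched outside of `Pandas`. The `Tally` class and the `total()` function below are hypothetical, written only for illustration: each method returns a replacement object, and a single `pipe()` re-director method forwards extra positional and named arguments to an arbitrary function.

```python
class Tally:
    """Hypothetical toy class, not part of any library."""

    def __init__(self, values):
        self.values = list(values)

    def add(self, delta):
        # Return a replacement object rather than mutating in place.
        return Tally(v + delta for v in self.values)

    def keep_positive(self):
        return Tally(v for v in self.values if v > 0)

    def pipe(self, func, *args, **kwargs):
        # Function re-director: pass this object as func's first argument.
        return func(self, *args, **kwargs)


def total(tally, scale=1):
    # An ordinary function, not a method of Tally.
    return scale * sum(tally.values)


result = (
    Tally([-2, 0, 3])
        .add(1)
        .keep_positive()
        .pipe(total, scale=10)
)
print(result)
```

With `pipe()` in place, functions that were never written as methods of the class can still take part in the chain.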

So depending on your point of view: “piping is poor-person’s method chaining” or “method chaining is poor-person’s piping” (taken from the usual quote comparing objects and closures).

If one wants to go further, there are a number of `Python` packages adding additional significant piping capabilities (either through notation, operator overloading, or other methods).

- sklearn.pipeline
- Stack overflow notes 1
- Stack overflow notes 2
- dplython
- sspipe
- Tidyverse pipes in Pandas
- chainlearn
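As a rough idea of what the operator overloading approach can look like, here is a minimal sketch. The `Pipe` wrapper and the `add()` and `times()` functions are hypothetical and are not the API of any package listed above; the sketch relies only on `Python`'s standard reflected-operator method `__ror__`.

```python
class Pipe:
    """Hypothetical wrapper, not the API of any package listed above."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __ror__(self, left):
        # Called for `left | Pipe(...)` when left does not handle `|` itself.
        return self.func(left, *self.args, **self.kwargs)


def add(x, delta):
    return x + delta

def times(x, factor):
    return x * factor

# Reads left to right: start with 3, add 4, then multiply by 2.
result = 3 | Pipe(add, 4) | Pipe(times, 2)
print(result)
```

One caveat of this simple design: an object that defines its own `|` operator may intercept the expression before `__ror__` is tried, so a wrapper like this is not guaranteed to work with every left-hand value.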

And that is piping versus method chaining.
