
R’s upcoming pipe appears to be currently proposed as a syntactic transform of the form:

  a |> f(...)   ->    f(a, ...)
  a |> f()      ->    f(a)


There is an active discussion of this prototype, and some interesting points have come up. Note that the current proposal appears to disallow a |> f -> f(a), a transform that is currently popular.
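The transform can be mimicked on quoted code without relying on the new operator at all. What follows is a minimal sketch, not R's parser: `pipe_transform` is a hypothetical helper name introduced here for illustration. It inserts the left-hand expression as the first argument of a right-hand call, and it rejects a bare function name, mirroring the apparent rule that a |> f is not accepted.

```r
# Hypothetical helper (not part of R): mimic the proposed rewrite on quoted code.
pipe_transform <- function(lhs, rhs) {
  # The proposal requires the right-hand side to be a call such as f() or f(...)
  if (!is.call(rhs)) {
    stop("right-hand side must be a call, for example f() rather than f")
  }
  # rhs[[1]] is the function part; the remaining elements are existing arguments
  as.call(c(list(rhs[[1]], lhs), as.list(rhs)[-1]))
}

expr_args  <- pipe_transform(quote(a), quote(f(b, c)))  # builds the call f(a, b, c)
expr_empty <- pipe_transform(quote(a), quote(f()))      # builds the call f(a)

# A bare symbol on the right is rejected by this sketch
bare_ok <- tryCatch({
  pipe_transform(quote(a), quote(f))
  TRUE
}, error = function(e) FALSE)

print(expr_args)
print(expr_empty)
print(bare_ok)
```

Because the rewrite only rearranges code, nothing here evaluates a or f; that is what makes it a syntactic transform rather than a function call.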

1. This is a language feature presented as a soon-to-be-user-visible prototype, not an RFC.
2. Some are objecting to the term “pipe.”
3. Some call this sort of pipe function composition.
4. It is noticed that this sort of substitution is generally thought of as a “macro.”
5. There is a claim that the proposed pipe seems to violate the beta-reduction rule of the lambda calculus: variables should be substitutable for values. The idea is that if the following code fragment is allowed:
  f <- function(x) { x + 1 }
  2 |> f()


Then replacing f with its value should also be valid. And one might even want a strong substitution of expressions, and be able to write:

  2 |> function(x) { x + 1 }


It appears the current R-language |> operator does not allow the second expression unless extra parentheses are introduced (either to group the function declaration terms or to add an argument evaluation slot). I haven’t tried this, so I may be wrong; I am attempting to excerpt this from the dev email chain.
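One way to check this without taking anyone's word for it is to ask the parser directly. The sketch below assumes nothing about which forms are accepted; it only reports whether each piece of text parses in the R build that runs it. The helper name `parses` is hypothetical. On a build that predates the |> operator, all three forms would simply fail to parse, so no expected output is shown.

```r
# Return TRUE if `code` parses in the running R, FALSE otherwise.
parses <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

forms <- c(
  named_function = "2 |> f()",
  bare_lambda    = "2 |> function(x) { x + 1 }",
  grouped_lambda = "2 |> (function(x) { x + 1 })()"
)

# Only parsing is tested; nothing is evaluated, so f does not need to exist.
result <- vapply(forms, parses, logical(1))
print(result)
```

Testing at the parse level matches the nature of the proposal: if the pipe is a syntactic transform, acceptance or rejection of a form is decided before any code runs.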

1. Point 1 seems a bit too pragmatic for core language features.
2. The word “pipe” is used in many languages to mean something other than the Unix pipe. Lore has it that “Unix pipe is the only pipe” bullying is why R’s magrittr package, the supplier of the popular R-extension pipe, is named magrittr.
3. There is a strong analogy to function composition, but there are some details that relate this strongly to function application or even macro application.
4. Macros or macro-like entities in R are likely a bigger problem than one might expect. What R calls “functions” are closures (they capture environments) with essentially Lisp FEXPR semantics. These forms were largely abandoned in later Lisps in favor of a split into functions that work in applicative order (arguments are evaluated before the function is evaluated) and macros (roughly transformations on code). According to Wikipedia, Kent Pitman argued in 1980 that once you have macros, FEXPRs become hard to defend.
5. Point 5 probably doesn’t matter to end users. The popular R pipe magrittr doesn’t allow this form either. Also, such an objection may be confusing substitution of expressions with substitution of values. One doesn’t expect to substitute the expression side of x <- 1 + 2 into x * 3 without some extra parentheses. However, I feel there are likely some important points of this form that have not been discussed in a large enough venue at this time. We may all be missing something if we don’t listen to feedback such as this.
6. A lot of the issue is: R FEXPRs get their arguments un-evaluated. One can use this to implement a lot of language features (control structures, domain specific mini-languages) at the user or package level. R packages really do feel like R extensions. However, FEXPRs don’t receive their arguments un-parsed, so some things are not possible at this level.
7. One of the objections to package-supplied pipes is the requirement of verbose user-specified infix operators that start and end with “%”. This is why magrittr’s pipe is written as “%>%” (there are also some issues of operator precedence, but they are minor and fixed by the occasional introduction of parentheses). data.table essentially uses “][” as a pipe operator without needing any syntactic hooks, instead relying on the self-return conventions of “[]”.
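The FEXPR-like behavior in points 4 and 6 is easy to see in a few lines of base R. This is only an illustrative sketch; the function names `show_code`, `ignore_arg`, and `unless` are hypothetical and chosen for this example. It shows a function receiving its argument as code, an argument that is never evaluated because the body never uses it, a user-level control structure, and the limit mentioned above: the argument must already be valid R syntax.

```r
# A function can capture the code of its argument instead of its value.
show_code <- function(x) {
  deparse(substitute(x))
}
captured <- show_code(1 + 2)   # the text of the expression, not its value

# Arguments are evaluated only if the function body actually uses them.
ignore_arg <- function(x) {
  "argument never evaluated"
}
untouched <- ignore_arg(stop("this error is never raised"))

# A user-level control structure: evaluate `expr` only when `cond` is FALSE.
unless <- function(cond, expr) {
  if (!cond) expr else invisible(NULL)
}
ran     <- unless(FALSE, "body ran")
skipped <- unless(TRUE, stop("never evaluated"))

# Arguments arrive un-evaluated but not un-parsed: invalid syntax fails
# in the parser, before show_code() is ever called.
bad_syntax <- tryCatch(
  parse(text = "show_code(1 +* 2)"),
  error = function(e) "parse error"
)

print(captured)
print(untouched)
print(ran)
print(bad_syntax)
```

The last case is why a new pipe notation cannot be added purely at the package level: a package function never sees text the parser has already rejected.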

My conclusions/opinions are the following.

• A syntactic transform of the type being proposed can only be done in the core language. So some variation of the core-R proposal likely has value.
• I strongly prefer place-holders. I think it is a much more powerful convention and avoids needing to introduce lambda in many places. I think Scala uses something like this to great success. There are also great advantages of being able to pipe into expressions instead of just functions. However, I understand the base-R pipe is not my pipe, so having expressed my preference I am willing to move on.
• The community would have loved an RFC on this. The new pipe was presented as part of the 2020 useR! conference, and announced, but comments really are not being solicited (I know that strongly includes this note).
• It would make sense to supply a second infix operator that is unbound, so packages that supply a pipe can use it as an alternate notation. If the base-R pipe’s only advantage over a package such as magrittr were going to be that the notation is “|>” instead of “%>%”, then give magrittr (and other packages) an additional symbol of similar quality. The unbound assignment operator “:=” is already used to great advantage in the data.table, dplyr, and wrapr packages. I ask: make some infix operator (with appropriate precedence) available to the packages. Possibilities might include: “=>”, “*>”, “:>”, “:]”, “|]”, or some other currently syntactically invalid fragment. I know I would love such for my own packages. Keep the better operator for base-R, but please give the packages a nice-ish one also.

All the packages could use the same new pipe symbol, and users pick which one by what package they attach. Obviously the popular choice would be magrittr, but as long as the symbol is equally available to all developers that feels fair.
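To make the package side of this concrete, here is a minimal sketch of a package-level pipe with a dot place-holder. The operator name `%dot%` is hypothetical and this is not how magrittr or wrapr are actually implemented; it only illustrates that a user-defined pipe has to be spelled with the %...% form, and that a place-holder allows piping into expressions as well as function calls. The last line shows that “:=” already parses as an infix call even though base R does not bind it to a function.

```r
# Hypothetical package-level pipe with a "." place-holder.
# It must be named %...% because that is the infix form users can define.
`%dot%` <- function(lhs, rhs) {
  # Capture the right-hand side as code, then evaluate it with "." bound to lhs
  eval(substitute(rhs), list(. = lhs), parent.frame())
}

piped_call <- 16 %dot% sqrt(.)    # pipe into a function call
piped_expr <- 4 %dot% (. + 1)     # pipe into a plain expression
chained    <- c(1, 2, 3) %dot% sum(. * 2) %dot% (. / 4)

# ":=" is accepted by the parser as an infix call with no base-R definition
assign_like <- quote(a := b)

print(piped_call)
print(piped_expr)
print(chained)
print(assign_like)
```

Note the parentheses around `. + 1` and `. / 4`: user-defined %...% operators bind more tightly than arithmetic, which is the kind of precedence wrinkle mentioned earlier.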

And those are my very unsolicited thoughts on the new R pipe. I admit: I have a dog in the race, my own pipe that I use in my own work. I feel that, in addition to possibly coloring my opinions (which I am trying to be careful about), this also gives me some relevant experience. I’ve already inserted one message into the R-dev email chain, so I am trying to limit my comments to this blog, which is less of an imposition on R-dev subscribers.