Parameterizing with bquote

[This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

One thing that is sure to get lost in my long note on macros in R is just how concise and powerful macros are. The problem is macros are concise, but they do a lot for you. So you get bogged down when you explain the joke.

Let’s try to be concise.

Below is an extension of an example taken from the Programming with dplyr note.

First let’s load the package and define our symbols that hold names of columns we wish to work with later.

suppressPackageStartupMessages(library("dplyr"))

group_nm <- as.name("am")
num_nm <- as.name("hp")
den_nm <- as.name("cyl")
derived_nm <- as.name(paste0(num_nm, "_per_", den_nm))
mean_nm <- as.name(paste0("mean_", derived_nm))
count_nm <- as.name("count")

Now let’s use rlang to substitute those symbols into a non-trivial dplyr pipeline.

mtcars %>%
  group_by(!!group_nm) %>%
  mutate(!!derived_nm := !!num_nm/!!den_nm) %>%
  summarize(
    !!mean_nm := mean(!!derived_nm),
    !!count_nm := n()
  ) %>%
  ungroup() %>%
  arrange(!!group_nm)
## # A tibble: 2 x 3
##      am mean_hp_per_cyl count
##   <dbl>           <dbl> <int>
## 1     0            22.7    19
## 2     1            23.4    13

The above is very useful, we have just gotten programmatic control of all the symbol names in a pipeline. This is what is needed to wrap such a pipeline in a function and make it parametric re-usable.

The thing is Thomas Lumley’s base::bquote() could achieve this in 2003, and Gregory R. Warnes’ gtools::strmacro() could further automate specifying the automation in 2005.

Lets show that. First we use gtools to build a "bquote() wrapping factory."

library("gtools")

# build a method wrapping macro
bq_wrap <- strmacro(
  FN,
  expr = {
    FN <- function(.data, ...) {
      env = parent.frame()
      mc <- substitute(dplyr::FN(.data = .data, ...))
      mc <- do.call(bquote, list(mc, where = env), envir = env)
      eval(mc, envir = env)
    }
  }
)

Now we use it to wrap some dplyr methods (ignoring non ... options).

# wrap some dplyr methods
bq_wrap(mutate)
bq_wrap(summarize)
bq_wrap(group_by)
bq_wrap(arrange)

At this point we have re-adapted 4 dplyr methods to use bquote() quasiquotation. This is what we mean by strmacro() is a tool to build tools.

And here is the same pipeline again, entirely driven by bquote().

mtcars %>%
  group_by(.(group_nm)) %>%
  mutate(.(derived_nm) := .(num_nm)/.(den_nm)) %>%
  summarize(
    .(mean_nm) := mean(.(derived_nm)),
    .(count_nm) := n()
  ) %>%
  ungroup() %>%
  arrange(.(group_nm))
## # A tibble: 2 x 3
##      am mean_hp_per_cyl count
##   <dbl>           <dbl> <int>
## 1     0            22.7    19
## 2     1            23.4    13

To leave a comment for the author, please follow the link and comment on their blog: R – Win-Vector Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)