Using mutate from dplyr inside a function: getting around non-standard evaluation

[This article was first published on NumberTheory » R stuff, and kindly contributed to R-bloggers.]

To edit or add columns to a data.frame, you can use mutate from the dplyr package:

library(dplyr)
mtcars %>% mutate(new_column = mpg + wt)

Here, dplyr uses non-standard evaluation (NSE) to find the contents of mpg and wt: it knows to look them up as columns of mtcars. This is convenient for interactive use, but awkward when mutate is called inside a function and the column names are passed in as arguments.
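
To make the difference concrete, here is a small sketch comparing the non-standard evaluation call with a base R equivalent that selects the columns explicitly by name. The variable names nse_result, se_result, col1 and col2 are only for illustration:

```r
library(dplyr)

# Non-standard evaluation: mpg and wt are looked up as columns of mtcars.
nse_result <- mtcars %>% mutate(new_column = mpg + wt)

# Standard evaluation in base R: the columns are selected explicitly by name,
# so the names can just as well be held in variables.
col1 <- "mpg"
col2 <- "wt"
se_result <- mtcars
se_result$new_column <- mtcars[[col1]] + mtcars[[col2]]

# Compare the values computed by the two approaches.
all.equal(nse_result$new_column, se_result$new_column)
```

The second form is easy to wrap in a function because the column names are ordinary strings; the rest of this post is about getting the same flexibility with mutate.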

The goal is to write a function f that takes the names of the mtcars columns to add up as strings, and runs mutate on them. We also want to be able to set the name of the new column. A first, naive approach might be:

f = function(col1, col2, new_col_name) {
    mtcars %>% mutate(new_col_name = col1 + col2)
}
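
A quick way to see that this version does not work is to call it and catch the error. This is only a sketch: f_naive is a renamed copy of the function above, and the exact error message depends on the dplyr version.

```r
library(dplyr)

# Renamed copy of the naive function, so it does not clash with later versions of f.
f_naive <- function(col1, col2, new_col_name) {
    mtcars %>% mutate(new_col_name = col1 + col2)
}

# Calling it with column names as strings fails; tryCatch captures the error
# instead of stopping the script.
result <- tryCatch(
    f_naive("wt", "mpg", "hahaaa"),
    error = function(e) e
)
inherits(result, "error")
```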

The problem is that col1 and col2 are not replaced by the column names they contain. Instead, dplyr looks for columns literally named col1 and col2 in mtcars; finding none, it falls back to the function arguments, which are strings, so the addition fails. In addition, the new column is named new_col_name rather than the content of new_col_name. To get around non-standard evaluation, you can use the lazyeval package. The following function does what we expect:

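library(lazyeval)
f = function(col1, col2, new_col_name) {
    # Build the formula ~ <col1> + <col2> from the column names given as strings
    mutate_call = lazyeval::interp(~ a + b, a = as.name(col1), b = as.name(col2))
    # Pass it to the standard-evaluation verb mutate_, named after new_col_name
    mtcars %>% mutate_(.dots = setNames(list(mutate_call), new_col_name))
}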
head(f('wt', 'mpg', 'hahaaa'))
   mpg cyl disp  hp drat    wt  qsec vs am gear carb hahaaa
1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 23.620
2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 23.875
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 25.120
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1 24.615
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 22.140
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 21.560

Given the call to f above, the important parts are:

  • lazyeval::interp(~ a + b, a = as.name(col1), b = as.name(col2)) creates the formula ~ wt + mpg, which stands for the expression wt + mpg.
  • mutate_(.dots = ...) calls mutate_, the version of mutate that uses standard evaluation (SE); the expressions to evaluate are passed as a list through .dots.
  • setNames(list(mutate_call), new_col_name) sets the output name to the content of new_col_name, i.e. hahaaa.
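
The building blocks can also be inspected on their own, outside of f. The following sketch assumes the lazyeval package is installed and reuses the argument values from the call f('wt', 'mpg', 'hahaaa'); the variable dots is introduced here only for illustration:

```r
library(lazyeval)

# Inputs as in the call f('wt', 'mpg', 'hahaaa').
col1 <- "wt"
col2 <- "mpg"
new_col_name <- "hahaaa"

# Step 1: substitute the column names into the template formula ~ a + b.
mutate_call <- lazyeval::interp(~ a + b, a = as.name(col1), b = as.name(col2))
print(mutate_call)   # a one-sided formula referring to wt and mpg

# Step 2: wrap it in a list named after the new column, ready for .dots.
dots <- setNames(list(mutate_call), new_col_name)
names(dots)
```

Printing the intermediate objects like this is a handy way to debug a function of this kind before handing the list to mutate_.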

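Since this post was written, dplyr has deprecated mutate_ and the other underscore verbs in favour of tidy evaluation. As a hedged sketch, assuming a dplyr version that supports the .data pronoun and the := operator (0.7.0 or later), the same function could be written without lazyeval. The name f_tidy is made up here and is not from the original post:

```r
library(dplyr)

f_tidy <- function(col1, col2, new_col_name) {
    # .data[[col1]] looks up the column whose name is stored in col1;
    # !!new_col_name := uses the content of new_col_name as the column name.
    mtcars %>% mutate(!!new_col_name := .data[[col1]] + .data[[col2]])
}

head(f_tidy("wt", "mpg", "hahaaa"))
```

This is a different technique from the lazyeval approach described above, but it serves the same purpose: the column names and the new column name are passed in as plain strings.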