Exploring NSE: enquo, quos and …

[This article was first published on Digital Age Economist, and kindly contributed to R-bloggers.]

As you get more interested in building your own custom functions, you quickly realise that unless those functions are tidyverse friendly, standardising your code workflow becomes a problem. So, how do you make your custom functions play well with your favourite tidyverse packages? Our friendly little helpers are going to be enquo and quos. I am going to build a function that calculates the proportion and cumulative proportion of a grouping variable.

suppressPackageStartupMessages(library(dplyr))

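prop_count <- function(df, vars){
  # Capture the bare column name as a quosure instead of evaluating it
  vars_col <- enquo(vars)
  
  # Print the quosure so we can see what was captured
  print(vars_col)
  
  df %>% 
    count(!!vars_col, sort = TRUE) %>%   # !! unquotes the captured column
    mutate(prop_n = prop.table(n)) %>%   # share of each group
    mutate(cumsum_n = cumsum(prop_n))    # running total of the shares
}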

dplyr::starwars %>% 
  prop_count(homeworld)

## <quosure>
##   expr: ^homeworld
##   env:  000000000C4567B8

## # A tibble: 49 x 4
##    homeworld     n prop_n cumsum_n
##    <chr>     <int>  <dbl>    <dbl>
##  1 Naboo        11 0.126     0.126
##  2 Tatooine     10 0.115     0.241
##  3 <NA>         10 0.115     0.356
##  4 Alderaan      3 0.0345    0.391
##  5 Coruscant     3 0.0345    0.425
##  6 Kamino        3 0.0345    0.460
##  7 Corellia      2 0.0230    0.483
##  8 Kashyyyk      2 0.0230    0.506
##  9 Mirial        2 0.0230    0.529
## 10 Ryloth        2 0.0230    0.552
## # ... with 39 more rows
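
To look a little closer at the quosure printed above, here is a minimal sketch that pulls it apart with rlang's accessor functions. The helper name show_quosure is hypothetical and exists only for this illustration; it is not part of the original function.

```r
library(rlang)

# Hypothetical helper: captures its argument as a quosure and returns it
show_quosure <- function(var) enquo(var)

q <- show_quosure(homeworld)

is_quosure(q)     # is this a quosure?
quo_get_expr(q)   # the quoted expression (the bare symbol)
quo_get_env(q)    # the environment the expression came from
as_label(q)       # the expression as a string, handy for naming columns
```

The expression and the environment are the two pieces that print(vars_col) shows as expr and env.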

From the output we can see that a quosure is a quoted expression bundled with the environment it was captured in, and that the bang-bang operator (!!) unquotes it so that count sees the column itself. What happens when we want the proportional count of multiple variables?

dplyr::starwars %>% 
  prop_count(homeworld, species)

## Error in prop_count(., homeworld, species): unused argument (species)

We get an error because prop_count only has two parameters, df and vars, so species has nowhere to go. We want our function to accommodate multiple grouping variables. This is where quos and ... come in. The ellipsis (...) lets a function accept any number of additional arguments; quos captures them as a list of quosures, and !!! splices that list back into count as separate arguments.
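
Before rewriting the function, the difference between a single named parameter and the ellipsis can be sketched in isolation. The helpers capture_one and capture_many are hypothetical names used only for this example, and the error text assumes an English locale.

```r
library(rlang)

# Hypothetical helper: one named parameter, so a second input is rejected
capture_one <- function(var) enquo(var)

# Hypothetical helper: `...` accepts any number of inputs
capture_many <- function(...) quos(...)

# Catch the error message instead of stopping the script
one_result <- tryCatch(
  capture_one(homeworld, species),
  error = function(e) conditionMessage(e)
)
print(one_result)

# quos() returns one quosure per input
many <- capture_many(homeworld, species)
length(many)
labels <- unname(vapply(many, as_label, character(1)))
print(labels)
```

The first call fails during argument matching, before anything is quoted, which is why enquo alone cannot handle several columns.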

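prop_count <- function(df, ...){
  # Capture every extra argument as a list of quosures
  vars_col <- quos(...)
  
  # Print the list so we can see one quosure per grouping variable
  print(vars_col)
  
  df %>% 
    count(!!!vars_col, sort = TRUE) %>%  # !!! splices the list into separate arguments
    mutate(prop_n = prop.table(n)) %>%   # share of each group
    mutate(cumsum_n = cumsum(prop_n))    # running total of the shares
}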

dplyr::starwars %>% 
  prop_count(homeworld, species)

## [[1]]
## <quosure>
##   expr: ^homeworld
##   env:  000000000BFAE918
## 
## [[2]]
## <quosure>
##   expr: ^species
##   env:  000000000BFAE918

## # A tibble: 58 x 5
##    homeworld species      n prop_n cumsum_n
##    <chr>     <chr>    <int>  <dbl>    <dbl>
##  1 Tatooine  Human        8 0.0920   0.0920
##  2 Naboo     Human        5 0.0575   0.149 
##  3 <NA>      Human        5 0.0575   0.207 
##  4 Alderaan  Human        3 0.0345   0.241 
##  5 Naboo     Gungan       3 0.0345   0.276 
##  6 Corellia  Human        2 0.0230   0.299 
##  7 Coruscant Human        2 0.0230   0.322 
##  8 Kamino    Kaminoan     2 0.0230   0.345 
##  9 Kashyyyk  Wookiee      2 0.0230   0.368 
## 10 Mirial    Mirialan     2 0.0230   0.391 
## # ... with 48 more rows
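
The function above uses !!! rather than !!, and the difference is easiest to see by building a call without running it. This is a rough sketch that assumes rlang's expr and exprs; it uses plain expressions instead of quosures only so the printed call is easier to read, and df and count are never evaluated here.

```r
library(rlang)

# Two bare column names held in a list
vars <- exprs(homeworld, species)

# `!!!` splices each element in as its own argument
spliced <- expr(count(df, !!!vars))

# `!!` inserts the whole list as a single argument
single <- expr(count(df, !!vars))

print(spliced)
length(spliced)   # function name + df + two columns
length(single)    # function name + df + one list
```

Splicing is what lets count receive homeworld and species as two separate grouping columns.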

Now our function accommodates multiple inputs in the tidyverse fashion! If you feel like reading more about non-standard evaluation, go read the full documentation.
