Exploring NSE: enquo, quos and …

(This article was first published on Digital Age Economist, and kindly contributed to R-bloggers)

As you get more interested in building your own custom functions, you quickly realise that unless those functions are tidyverse friendly, standardising your code workflow becomes a problem. So how do you make your custom functions play well with your favourite tidyverse packages? Our friendly little helpers are going to be enquo and quos. I am going to build a function that calculates the proportion and cumulative proportion of counts for a grouping variable.

suppressPackageStartupMessages(library(dplyr))

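prop_count <- function(df, vars){
  # Capture the bare column name as a quosure instead of evaluating it
  vars_col <- enquo(vars)
  
  # Show what was captured
  print(vars_col)
  
  df %>% 
    count(!!vars_col, sort = TRUE) %>%    # !! unquotes the captured column
    mutate(prop_n = prop.table(n)) %>%    # share of all rows
    mutate(cumsum_n = cumsum(prop_n))     # running total of the shares
}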

dplyr::starwars %>% 
  prop_count(homeworld)
## <quosure>
##   expr: ^homeworld
##   env:  000000000C4567B8
## # A tibble: 49 x 4
##    homeworld     n prop_n cumsum_n
##    <chr>     <int>  <dbl>    <dbl>
##  1 Naboo        11 0.126     0.126
##  2 Tatooine     10 0.115     0.241
##  3 <NA>         10 0.115     0.356
##  4 Alderaan      3 0.0345    0.391
##  5 Coruscant     3 0.0345    0.425
##  6 Kamino        3 0.0345    0.460
##  7 Corellia      2 0.0230    0.483
##  8 Kashyyyk      2 0.0230    0.506
##  9 Mirial        2 0.0230    0.529
## 10 Ryloth        2 0.0230    0.552
## # ... with 39 more rows
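
Before reading that output, it may help to look at a quosure on its own. The following is only a minimal sketch and not part of the original function: capture_var() and toy are hypothetical names made up for illustration, and the accessor functions come from the rlang package, which is assumed to be installed alongside dplyr.

```r
library(rlang)

# Hypothetical helper (not in the original post): capture an argument
# as a quosure and return it without evaluating it.
capture_var <- function(x) {
  enquo(x)
}

q <- capture_var(homeworld)

# The two parts reported by print(): the expression and its environment.
quo_get_expr(q)   # the bare symbol homeworld
quo_get_env(q)    # the environment capture_var() was called from

# A quosure is only evaluated on demand. Here the column is looked up
# in a small, made-up data frame supplied as a data mask.
toy <- data.frame(homeworld = c("Naboo", "Tatooine", "Naboo"),
                  stringsAsFactors = FALSE)
eval_tidy(q, data = toy)
```

Nothing is evaluated until eval_tidy() is called, which is roughly what count() does internally once the quosure has been unquoted with !!.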

From the output we can see that a quosure is a quoted expression that also keeps track of the environment it was created in, and that we can use bang-bang (!!) to unquote it so that count() sees the column itself. What happens when we want the proportional count of multiple variables?

dplyr::starwars %>% 
  prop_count(homeworld, species)
## Error in prop_count(., homeworld, species): unused argument (species)

We get an error because prop_count only accepts df and vars, so species arrives as an extra argument that the function has no parameter for. We want our function to accommodate multiple grouping variables. This is where quos and ... come in. The ellipsis (...) lets a function accept any number of additional arguments, and quos captures all of them as a list of quosures.

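prop_count <- function(df, ...){
  # Capture every argument passed through ... as a list of quosures
  vars_col <- quos(...)
  
  # Show what was captured
  print(vars_col)
  
  df %>% 
    count(!!!vars_col, sort = TRUE) %>%   # !!! splices the list in as separate arguments
    mutate(prop_n = prop.table(n)) %>%    # share of all rows
    mutate(cumsum_n = cumsum(prop_n))     # running total of the shares
}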

dplyr::starwars %>% 
  prop_count(homeworld, species)
## [[1]]
## <quosure>
##   expr: ^homeworld
##   env:  000000000BFAE918
## 
## [[2]]
## <quosure>
##   expr: ^species
##   env:  000000000BFAE918
## # A tibble: 58 x 5
##    homeworld species      n prop_n cumsum_n
##    <chr>     <chr>    <int>  <dbl>    <dbl>
##  1 Tatooine  Human        8 0.0920   0.0920
##  2 Naboo     Human        5 0.0575   0.149 
##  3 <NA>      Human        5 0.0575   0.207 
##  4 Alderaan  Human        3 0.0345   0.241 
##  5 Naboo     Gungan       3 0.0345   0.276 
##  6 Corellia  Human        2 0.0230   0.299 
##  7 Coruscant Human        2 0.0230   0.322 
##  8 Kamino    Kaminoan     2 0.0230   0.345 
##  9 Kashyyyk  Wookiee      2 0.0230   0.368 
## 10 Mirial    Mirialan     2 0.0230   0.391 
## # ... with 48 more rows
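
To make the difference between !! and !!! a little more concrete, here is a small sketch that builds the count() call without running it. It is an illustration under assumptions rather than what dplyr does internally: capture_vars() is a hypothetical helper, df is just a placeholder symbol, and expr() and quo_get_expr() come from the rlang package.

```r
library(rlang)

# Hypothetical helper (not in the original post): capture every
# argument passed through ... as a list of quosures.
capture_vars <- function(...) {
  quos(...)
}

qs <- capture_vars(homeworld, species)

# One quosure per argument
length(qs)

# Strip the environments so that only the bare symbols remain
syms <- unname(lapply(qs, quo_get_expr))

# !! would insert a single expression; !!! splices each element of
# the list in as its own argument. expr() builds the call without
# evaluating it, so df does not need to exist.
built <- expr(count(df, !!!syms, sort = TRUE))
built
```

The call that comes back has homeworld and species as two separate arguments, which is the shape count() needs in order to group by both.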

Now our function accommodates multiple inputs in the tidyverse fashion! If you feel like reading more about non-standard evaluation, go read the full documentation.
