[R] tidy evaluation in ggplot2 (part 2)

[This article was first published on R on Zhenguo Zhang's Blog, and kindly contributed to R-bloggers.]
Zhenguo Zhang’s Blog /2023/11/11/r-tidy-evaluation-in-ggplot2-part-2/ –
knitr::opts_chunk$set(echo=TRUE, fig.align = "center", fig.width = 6, fig.height = 5, dpi=150, warning=FALSE)
library(knitr)
# ggplot2 re-exports the rlang helpers used below, such as expr(), sym(), quo() and enquo()
library(ggplot2)

How to use variables to refer to columns in ggplot2

Let’s start with an example: making a scatter plot from the dataset mtcars.

head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

We can make the plot with the following commands:

plt<-ggplot(mtcars, aes(x=wt, y=mpg, color=factor(cyl))) + theme_bw()
plt<-plt+geom_point()
plt

If we wrap the plotting code in a function that takes the color variable as an argument, how can we make the same plot? There are four ways:

use ‘{{}}’

make_scatter_plot1<-function(colorVar) {
  plt<-ggplot(mtcars, aes(x=wt, y=mpg, color=factor({{colorVar}}))) + theme_bw()
  plt<-plt+geom_point()
  plt
}
make_scatter_plot1(cyl)

The embrace operator {{ is used to write functions that call other data-masking functions. It transports a data-masked argument (an argument that can refer to columns of a data frame) from one function to another.
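As a minimal sketch of that "transport" idea, the wrapper below forwards three arguments to aes() instead of one. The function name make_scatter_plot and its argument names are hypothetical, chosen only for illustration.

```r
library(ggplot2)

# Hypothetical wrapper: every column-like argument is forwarded to aes() with {{ }}.
make_scatter_plot <- function(data, xVar, yVar, colorVar) {
  ggplot(data, aes(x = {{ xVar }}, y = {{ yVar }}, color = factor({{ colorVar }}))) +
    geom_point() +
    theme_bw()
}

# Bare column names and expressions built from columns are both accepted.
p1 <- make_scatter_plot(mtcars, wt, mpg, cyl)
p2 <- make_scatter_plot(mtcars, wt, hp / wt, gear)
```

Because the caller's expression is passed through unevaluated, hp / wt is computed inside the data mask of mtcars when the plot is built, not at the call site.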

use .data

make_scatter_plot2<-function(colorVar) {
  plt<-ggplot(mtcars, aes(x=wt, y=mpg, color=factor(.data[[colorVar]]))) + theme_bw()
  plt<-plt+geom_point()
  plt
}
make_scatter_plot2("cyl")

use sym()

make_scatter_plot3<-function(colorVar) {
  plt<-ggplot(mtcars, aes(x=wt, y=mpg, color=factor(!!sym(colorVar)) )) + theme_bw()
  plt<-plt+geom_point()
  plt
}
make_scatter_plot3("cyl")
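The string-based approaches are handy when the column names are only known at run time. The sketch below assumes a hypothetical vector color_columns and builds one plot per name with .data; the !!sym() form could be swapped in the same way.

```r
library(ggplot2)

# Column names held as ordinary strings (hypothetical example values).
color_columns <- c("cyl", "gear", "carb")

# One scatter plot per column; .data[[colName]] looks the string up in mtcars.
plots <- lapply(color_columns, function(colName) {
  ggplot(mtcars, aes(x = wt, y = mpg, color = factor(.data[[colName]]))) +
    geom_point() +
    labs(color = colName) +
    theme_bw()
})
names(plots) <- color_columns

plots$gear  # the plot colored by gear
```

lapply() is used rather than a for loop because aes() is evaluated lazily: each call of the anonymous function keeps its own colName, whereas a loop variable would have changed by the time the plots are drawn.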

use quo()

make_scatter_plot4<-function(colorVar) {
  plt<-ggplot(mtcars, aes(x=wt, y=mpg, color=factor(!!colorVar) )) + theme_bw()
  plt<-plt+geom_point()
  plt
}
myvar<-quo(cyl)
make_scatter_plot4(myvar)

Key concepts

defusing

When a piece of R code is defused, R does not return its value. Instead, it returns the expression itself: a tree-like object that describes how to compute the value and that can be evaluated later in a chosen environment.

For example, one can use expr() to create a defused object.

# return result
1+1
## [1] 2
# return defused object
expr(1+1)
## 1 + 1

One use of a defused expression is to resume its evaluation later inside a data mask, as in the example below:

e <- expr(mean(cyl))
eval(e, mtcars)
## [1] 6.1875

Some functions defuse their arguments. When such a function is called from inside another function, the wrapper's argument must be passed on with the embrace operator {{, which combines enquo() and !!.
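The equivalence can be sketched with two wrappers around dplyr::summarise(); the names mean_embrace and mean_enquo are hypothetical and only serve the comparison.

```r
library(dplyr)

# Version 1: embrace the argument directly.
mean_embrace <- function(data, var) {
  summarise(data, avg = mean({{ var }}))
}

# Version 2: the same thing spelled out in two steps.
mean_enquo <- function(data, var) {
  var <- enquo(var)                   # defuse the caller's expression
  summarise(data, avg = mean(!!var))  # inject it into the data-masking call
}

res1 <- mean_embrace(mtcars, mpg)
res2 <- mean_enquo(mtcars, mpg)
identical(res1, res2)
```

The two-step form is mostly useful when the defused argument needs to be inspected or modified before it is injected.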

Defused expressions can also be created by using other functions such as sym() and call(), for example:

# create a symbol
vName<-"mtcars"
sym(vName)

# make a call
call("head", sym(vName))

There are two ways to defuse expressions: use expr() to defuse your own expression, and use enquo() to defuse an expression that a user passes to your function.
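A small sketch of the two cases, using rlang directly. The helper capture_arg is a hypothetical name, and eval_tidy() is used here as the rlang counterpart of the eval() call shown earlier.

```r
library(rlang)

# Your own expression: defuse it with expr().
own_expr <- expr(mean(cyl))

# A user's expression: defuse the function argument with enquo().
capture_arg <- function(x) enquo(x)
user_quo <- capture_arg(mean(cyl))

is_call(own_expr)     # a plain call
is_quosure(user_quo)  # a quosure: the expression plus the caller's environment

# Both can be evaluated later with mtcars as the data mask.
own_value  <- eval_tidy(own_expr, mtcars)
user_value <- eval_tidy(user_quo, mtcars)
```

The quosure remembers where the user's expression came from, so any non-column variables it mentions are still looked up in the user's environment.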

Defusing is known as quoting in other frameworks.

  • The equivalent of expr() is base::bquote().

  • The equivalent of enquo() is base::substitute(). The latter returns a naked expression instead of a quosure.

  • There is no equivalent for enquos(...) but you can defuse dots as a list of naked expressions with eval(substitute(alist(...))).
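The three base R counterparts listed above can be tried without loading any package. The helper names capture_base and capture_dots_base are hypothetical.

```r
# expr() vs. bquote(): .(x) plays the role of !!x and builds the call 1 + 2.
x <- 2
base_expr <- bquote(1 + .(x))

# enquo() vs. substitute(): substitute() returns a bare call, without an environment.
capture_base <- function(arg) substitute(arg)
base_arg <- capture_base(mean(cyl))

# enquos(...) vs. eval(substitute(alist(...))): a list of bare expressions.
capture_dots_base <- function(...) eval(substitute(alist(...)))
base_dots <- capture_dots_base(mean(cyl), wt / 2)

# Bare expressions are evaluated with base::eval(), here against mtcars.
eval(base_arg, mtcars)
```

Since the base versions drop the environment, the caller has to supply the right enclosure to eval() when the expression refers to variables outside the data frame.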

data-masking

Data masking means that a data frame is inserted at the bottom of the current chain of environments, so its columns take precedence over variables of the same name in the user environment.

For example, if we assign a value to a variable that shares its name with a column, the variable is hidden inside data-masking functions, as shown below:

mpg<-100
# the following function can't see it
dplyr::summarise(mtcars, mean(mpg))
##   mean(mpg)
## 1  20.09062

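As you can see, the function used the column mpg from the data frame mtcars, not the variable mpg defined above.

One way to disambiguate the user and mask environments is to use the tidy eval pronouns: .data refers to the data mask and .env to the user environment.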

dplyr::summarise(mtcars, mean(.env$mpg))
##   mean(.env$mpg)
## 1            100
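The lookup order can also be observed without dplyr. The sketch below uses rlang::eval_tidy() as the data-masking function; scale_factor is a hypothetical variable that exists only in the user environment.

```r
library(rlang)

mpg <- 100         # shares its name with a column of mtcars
scale_factor <- 2  # exists only in the user environment

# mpg resolves to the column; scale_factor is not a column,
# so the lookup falls through to the user environment.
masked <- eval_tidy(quo(mean(mpg) * scale_factor), mtcars)

# Pronouns make the intended source explicit.
from_data <- eval_tidy(quo(mean(.data$mpg)), mtcars)
from_env  <- eval_tidy(quo(mean(.env$mpg)), mtcars)
```

Here masked is the column mean multiplied by 2, from_data is the column mean, and from_env is the user's value of 100.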

Summary

In short, to evaluate a variable in a different context, first defuse it (with functions such as quo() and enquo()) and then inject it (with {{ }}, !!, !!sym(), etc.) where the evaluation is needed.
