Why is using list() critical for .dots = setNames() uses in dplyr?
[This article was first published on Higher Order Functions, and kindly contributed to R-bloggers].
I wrote an answer about why setNames() shows up sometimes in standard evaluation with dplyr. My explanation turned into a mini-tutorial on why those standard evaluation functions have a .dots argument. The basic idea is that the usual variadic argument ... is a series of expressions that get evaluated inside of the dataframe.
```r
library("dplyr")
# standardize and round
z_round <- . %>% scale %>% as.numeric %>% round(2)
# The two expressions defining zSL, zSW are the `...`
iris %>%
  mutate_(zSL = ~ z_round(Sepal.Length),
          zSW = ~ z_round(Sepal.Width)) %>%
  tbl_df
#> # A tibble: 150 × 7
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species   zSL   zSW
#>           <dbl>       <dbl>        <dbl>       <dbl>  <fctr> <dbl> <dbl>
#> 1           5.1         3.5          1.4         0.2  setosa -0.90  1.02
#> 2           4.9         3.0          1.4         0.2  setosa -1.14 -0.13
#> 3           4.7         3.2          1.3         0.2  setosa -1.38  0.33
#> 4           4.6         3.1          1.5         0.2  setosa -1.50  0.10
#> 5           5.0         3.6          1.4         0.2  setosa -1.02  1.25
#> 6           5.4         3.9          1.7         0.4  setosa -0.54  1.93
#> 7           4.6         3.4          1.4         0.3  setosa -1.50  0.79
#> 8           5.0         3.4          1.5         0.2  setosa -1.02  0.79
#> 9           4.4         2.9          1.4         0.2  setosa -1.74 -0.36
#> 10          4.9         3.1          1.5         0.1  setosa -1.14  0.10
#> # ... with 140 more rows
```
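To make "expressions that get evaluated inside of the dataframe" a little more concrete, here is a minimal base-R sketch. The helper eval_in_df() is hypothetical, written only for illustration, and is not how dplyr implements mutate_. It assumes every argument passed through ... is a one-sided formula, and it evaluates the right-hand side of each formula with the dataframe's columns in scope.

```r
# Hypothetical helper (not part of dplyr): evaluate each one-sided formula
# passed through `...` using the columns of `df` as variables.
eval_in_df <- function(df, ...) {
  formulas <- list(...)  # one list element per argument passed through `...`
  lapply(formulas, function(f) {
    # f[[2]] is the unevaluated expression on the right-hand side of `~`
    eval(f[[2]], df, environment(f))
  })
}

res <- eval_in_df(iris,
                  n   = ~ length(Sepal.Length),
                  top = ~ max(Sepal.Width))
str(res)
```

Each formula stores its expression without evaluating it, so Sepal.Length and Sepal.Width are only looked up once eval() is handed the dataframe.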
If we programmatically assemble or manipulate those expressions before calling mutate_, we can't pass them through ..., because we now have a single list of expressions, not a series of individual expressions. We use the .dots argument instead.
```r
exps <- list(
  zSL = ~ z_round(Sepal.Length),
  zSW = ~ z_round(Sepal.Width)
)
iris %>% mutate_(exps)
#> Error in UseMethod("as.lazy"): no applicable method for 'as.lazy' applied to an object of class "list"

iris %>% mutate_(.dots = exps) %>% tbl_df
#> # A tibble: 150 × 7
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species   zSL   zSW
#>           <dbl>       <dbl>        <dbl>       <dbl>  <fctr> <dbl> <dbl>
#> 1           5.1         3.5          1.4         0.2  setosa -0.90  1.02
#> 2           4.9         3.0          1.4         0.2  setosa -1.14 -0.13
#> 3           4.7         3.2          1.3         0.2  setosa -1.38  0.33
#> 4           4.6         3.1          1.5         0.2  setosa -1.50  0.10
#> 5           5.0         3.6          1.4         0.2  setosa -1.02  1.25
#> 6           5.4         3.9          1.7         0.4  setosa -0.54  1.93
#> 7           4.6         3.4          1.4         0.3  setosa -1.50  0.79
#> 8           5.0         3.4          1.5         0.2  setosa -1.02  0.79
#> 9           4.4         2.9          1.4         0.2  setosa -1.74 -0.36
#> 10          4.9         3.1          1.5         0.1  setosa -1.14  0.10
#> # ... with 140 more rows
```
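The difference between "a list of expressions" and "a series of individual expressions" can be sketched in base R alone. In the example below, count_dots() and the vectors cols and new_names are hypothetical names chosen for illustration. The formulas refer to the z_round function from the example above, but they are never evaluated here, so the sketch runs without dplyr. It also shows one way setNames() might be used to attach output names to a programmatically built list.

```r
# Hypothetical inputs: columns to transform and the names of the new columns
cols      <- c("Sepal.Length", "Sepal.Width")
new_names <- c("zSL", "zSW")

# Build one formula per column, e.g. ~ z_round(Sepal.Length)
exps <- lapply(cols, function(col) as.formula(paste0("~ z_round(", col, ")")))

# setNames() attaches the names that would become the new column names
exps <- setNames(exps, new_names)
names(exps)
#> [1] "zSL" "zSW"

# Hypothetical helper: how many arguments arrived through `...`?
count_dots <- function(...) length(list(...))

# Passed directly, the whole list is one argument
n_as_one <- count_dots(exps)
n_as_one
#> [1] 1

# do.call() splices the list so each element becomes its own argument
n_spliced <- do.call(count_dots, exps)
n_spliced
#> [1] 2
```

A list built this way has the same shape as the hand-written exps above, so, assuming a dplyr version that still provides mutate_, it could be passed as .dots = exps.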