A Quick Appreciation of the R transform Function

September 10, 2018
By

(This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers)

R users who also use the dplyr package will be able to quickly understand the following code that adds an estimated area column to a data.frame.

suppressPackageStartupMessages(library("dplyr"))

iris %>%
  mutate(
    ., 
    Petal.Area = (pi/4)*Petal.Width*Petal.Length) %>%
  head(.)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species Petal.Area
## 1          5.1         3.5          1.4         0.2  setosa  0.2199115
## 2          4.9         3.0          1.4         0.2  setosa  0.2199115
## 3          4.7         3.2          1.3         0.2  setosa  0.2042035
## 4          4.6         3.1          1.5         0.2  setosa  0.2356194
## 5          5.0         3.6          1.4         0.2  setosa  0.2199115
## 6          5.4         3.9          1.7         0.4  setosa  0.5340708

The notation we used above is the "explicit argument" variation we recommend for readability. What a lot of dplyr users do not seem to know is: base-R already has this functionality. The function is called transform().

To demonstrate this, let’s first detach dplyr to show that we are not using functions from dplyr.

detach("package:dplyr", unload = TRUE)

Now let’s write the equivalent pipeline using exclusively base-R.

iris ->.
   transform(
     ., 
     Petal.Area = (pi/4)*Petal.Width*Petal.Length) ->.
   head(.)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species Petal.Area
## 1          5.1         3.5          1.4         0.2  setosa  0.2199115
## 2          4.9         3.0          1.4         0.2  setosa  0.2199115
## 3          4.7         3.2          1.3         0.2  setosa  0.2042035
## 4          4.6         3.1          1.5         0.2  setosa  0.2356194
## 5          5.0         3.6          1.4         0.2  setosa  0.2199115
## 6          5.4         3.9          1.7         0.4  setosa  0.5340708

The "->." notation is a the end-of-line variation of the Bizarro Pipe. The transform() function has been part of R since 1998. dplyr::mutate() was introduced in 2014.

git log --all -p --reverse --source -S 'transform <-'

commit 41c2f7338c45dbf9eac99c210206bc3657bca98a refs/remotes/origin/tags/R-0-62-4
Author: pd 
Date:   Wed Feb 11 18:31:12 1998 +0000

    Added the frametools functions subset() and transform()
    
    git-svn-id: https://svn.r-project.org/R/[email protected] 00db46b3-68df-0310-9c12-caf00c1e9a41

To leave a comment for the author, please follow the link and comment on their blog: R – Win-Vector Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)