How to Use Mutate function in R

[This article was first published on Data Science Tutorials, and kindly contributed to R-bloggers.]

The post How to Use Mutate function in R appeared first on Data Science Tutorials

This article demonstrates how to add new variables to a data frame using the mutate() function from R’s dplyr package.

The dplyr package provides the following functions for adding or modifying variables in a data frame.

mutate() – adds new variables to a data frame while keeping the existing ones.

transmute() – adds new variables to a data frame and drops the existing ones.

mutate_all() – modifies every variable in a data frame at once.

mutate_at() – modifies specific variables selected by name.

mutate_if() – modifies all variables that satisfy a given condition.

mutate()

The mutate() function adds new variables to a data frame while preserving the existing ones. The basic syntax of mutate() is as follows.

data <- mutate(data, new_variable = existing_variable / 3)

data: the data frame to which the new variables are added; mutate() returns a modified copy, so the result is assigned back to data

new_variable: the name of the new variable

existing_variable: an existing variable in the data frame that is transformed to create the new variable

As an illustration, the code that follows adds a new variable called root_sepal_width to the first rows of the built-in iris dataset.

First, store the first six rows of the iris dataset as a data frame.

data <- head(iris)
data
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
library(dplyr)

Add a new column, root_sepal_width, that holds the square root of the Sepal.Width variable.

data %>% mutate(root_sepal_width = sqrt(Sepal.Width))
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species root_sepal_width
1          5.1         3.5          1.4         0.2  setosa         1.870829
2          4.9         3.0          1.4         0.2  setosa         1.732051
3          4.7         3.2          1.3         0.2  setosa         1.788854
4          4.6         3.1          1.5         0.2  setosa         1.760682
5          5.0         3.6          1.4         0.2  setosa         1.897367
6          5.4         3.9          1.7         0.4  setosa         1.974842
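
mutate() can also create several variables in one call, and a later expression may refer to a variable created earlier in the same call. The following is a minimal sketch, assuming dplyr is installed; the names sepal_area and large_sepal and the threshold of 17 are hypothetical and chosen only for illustration.

```r
library(dplyr)

data <- head(iris)

# Create two variables in a single mutate() call.
# The second expression uses the variable created by the first.
result <- data %>%
  mutate(
    sepal_area  = Sepal.Length * Sepal.Width,  # built from two existing variables
    large_sepal = sepal_area > 17              # refers to sepal_area defined just above
  )

result
```

All five original columns are kept, and the two new ones are appended on the right.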

transmute()

The transmute() function adds new variables to a data frame and drops the existing ones. The code that follows demonstrates how to add two new variables to a dataset while discarding all of the existing variables.

First, store the first six rows of the iris dataset as a data frame.

data <- head(iris)
data
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

Create two new variables, then get rid of all the others.

data %>% transmute(root_sepal_width = sqrt(Sepal.Width),
                   root_petal_width = sqrt(Petal.Width))
  root_sepal_width root_petal_width
1         1.870829        0.4472136
2         1.732051        0.4472136
3         1.788854        0.4472136
4         1.760682        0.4472136
5         1.897367        0.4472136
6         1.974842        0.6324555
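
In dplyr 1.0.0 and later, a similar result can usually be obtained with mutate() and its .keep argument. The sketch below assumes dplyr >= 1.0.0 and compares the two approaches on the same data; the object names with_transmute and with_mutate are hypothetical.

```r
library(dplyr)

data <- head(iris)

# transmute() keeps only the newly created variables
with_transmute <- data %>%
  transmute(root_sepal_width = sqrt(Sepal.Width),
            root_petal_width = sqrt(Petal.Width))

# mutate() with .keep = "none" also drops variables not created in the call
with_mutate <- data %>%
  mutate(root_sepal_width = sqrt(Sepal.Width),
         root_petal_width = sqrt(Petal.Width),
         .keep = "none")

# Compare the two results
all.equal(with_transmute, with_mutate)
```

Using mutate() throughout keeps one verb for both cases, with .keep controlling which columns survive.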

mutate_all()

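The mutate_all() function modifies every variable in a data frame at once by applying a given function to each of them. The function can be passed by name, as a formula such as ~ . / 10, or wrapped in funs(); note that funs() has been deprecated since dplyr 0.8.0.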

The use of mutate_all() to divide each column in a data frame by ten is demonstrated in the code below.

Create a new data frame from the first six rows of iris, without the Species variable.

data2 <- head(iris) %>% select(-Species)
data2

Divide each of the data frame’s variables by 10.

# A formula replaces funs(), which is deprecated since dplyr 0.8.0
data2 %>% mutate_all(~ . / 10)
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1         0.51        0.35         0.14        0.02
2         0.49        0.30         0.14        0.02
3         0.47        0.32         0.13        0.02
4         0.46        0.31         0.15        0.02
5         0.50        0.36         0.14        0.02
6         0.54        0.39         0.17        0.04

Note that you can keep the original variables and add the transformed ones as new columns by naming the function; that name is appended as a suffix to each existing variable name.

# A named list of formulas replaces funs(), which is deprecated since dplyr 0.8.0
data2 %>% mutate_all(list(mod = ~ . / 10))
  Sepal.Length Sepal.Width Petal.Length Petal.Width Sepal.Length_mod
1          5.1         3.5          1.4         0.2             0.51
2          4.9         3.0          1.4         0.2             0.49
3          4.7         3.2          1.3         0.2             0.47
4          4.6         3.1          1.5         0.2             0.46
5          5.0         3.6          1.4         0.2             0.50
6          5.4         3.9          1.7         0.4             0.54
  Sepal.Width_mod Petal.Length_mod Petal.Width_mod
1            0.35             0.14            0.02
2            0.30             0.14            0.02
3            0.32             0.13            0.02
4            0.31             0.15            0.02
5            0.36             0.14            0.02
6            0.39             0.17            0.04
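
Since dplyr 1.0.0, mutate_all() and the other scoped verbs have been superseded by across() used inside mutate(). The following is a sketch of the two examples above rewritten that way, assuming dplyr >= 1.0.0; the object names divided and with_mod are hypothetical.

```r
library(dplyr)

data2 <- head(iris) %>% select(-Species)

# Overwrite every column with its value divided by 10
divided <- data2 %>%
  mutate(across(everything(), ~ .x / 10))

# Keep the original columns and add "_mod" copies instead;
# .names builds each new column name from the original one
with_mod <- data2 %>%
  mutate(across(everything(), ~ .x / 10, .names = "{.col}_mod"))

names(with_mod)
```

The .names argument plays the role of the function name in the mutate_all() version: it decides whether columns are replaced or added.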

mutate_at()

The mutate_at() function modifies specific variables selected by name. The code below uses mutate_at() to divide two particular variables by 10:

# A named list of formulas replaces funs(), which is deprecated since dplyr 0.8.0
data2 %>% mutate_at(c("Sepal.Length", "Sepal.Width"), list(mod = ~ . / 10))
  Sepal.Length Sepal.Width Petal.Length Petal.Width Sepal.Length_mod
1          5.1         3.5          1.4         0.2             0.51
2          4.9         3.0          1.4         0.2             0.49
3          4.7         3.2          1.3         0.2             0.47
4          4.6         3.1          1.5         0.2             0.46
5          5.0         3.6          1.4         0.2             0.50
6          5.4         3.9          1.7         0.4             0.54
  Sepal.Width_mod
1            0.35
2            0.30
3            0.32
4            0.31
5            0.36
6            0.39
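
The same selection can be expressed with across(), which accepts column names or tidyselect helpers as its first argument. This is a minimal sketch assuming dplyr >= 1.0.0; the object names by_name and by_prefix are hypothetical.

```r
library(dplyr)

data2 <- head(iris) %>% select(-Species)

# Select the two columns explicitly by name
by_name <- data2 %>%
  mutate(across(c(Sepal.Length, Sepal.Width), ~ .x / 10, .names = "{.col}_mod"))

# Select the same two columns with a tidyselect helper
by_prefix <- data2 %>%
  mutate(across(starts_with("Sepal"), ~ .x / 10, .names = "{.col}_mod"))

names(by_name)
```

In this data frame both calls pick the same two columns, so the results should match.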

mutate_if()

All variables that match a specific condition are modified by the mutate_if() function.

The mutate_if() function can be used to change any variables of type factor to type character, as shown in the code below.

data <- head(iris)
sapply(data, class)
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
   "numeric"    "numeric"    "numeric"    "numeric"     "factor"

Convert every factor variable to a character variable.

new_data <- data %>% mutate_if(is.factor, as.character)
sapply(new_data, class)
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
   "numeric"    "numeric"    "numeric"    "numeric"  "character"

The mutate_if() function can also be used to round every numeric variable to the nearest whole number, as the following example shows.

Take the first six rows of the iris dataset.

data <- head(iris)
data
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

Round every numeric variable to the nearest whole number.

data %>% mutate_if(is.numeric, round, digits = 0)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1            5           4            1           0  setosa
2            5           3            1           0  setosa
3            5           3            1           0  setosa
4            5           3            2           0  setosa
5            5           4            1           0  setosa
6            5           4            2           0  setosa
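
Both mutate_if() examples can also be written with across() and the where() selection helper, which picks columns by a predicate function. This is a sketch under the assumption that dplyr >= 1.0.0 is installed; the object names converted and rounded are hypothetical.

```r
library(dplyr)

data <- head(iris)

# Convert every factor column to character
converted <- data %>%
  mutate(across(where(is.factor), as.character))

# Round every numeric column to the nearest whole number
rounded <- data %>%
  mutate(across(where(is.numeric), ~ round(.x, digits = 0)))

sapply(converted, class)
rounded
```

Writing the rounding step as a formula keeps the digits argument next to the call to round().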
