How to do Conditional Mutate in R?

[This article was first published on Data Science Tutorials, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The post How to do Conditional Mutate in R? appeared first on Data Science Tutorials

How to do Conditional Mutate in R, It’s common to wish to add a new variable based on a condition to an existing data frame. The mutate() and case when() functions from the dplyr package make this task fortunately simple.

Cumulative Sum calculation in R – Data Science Tutorials

With the following data frame, this lesson provides numerous examples of how to apply these functions.

How to do Conditional Mutate in R

Let’s create a data frame

df <- data.frame(player = c('P1', 'P2', 'P3', 'P4', 'P5'),
position = c('A', 'B', 'A', 'B', 'B'),
points = c(102, 215, 319, 125, 112),
rebounds = c(22, 12, 19, 23, 36))

Let’s view the data frame

df
   player position points rebounds
1     P1        A    102       22
2     P2        B    215       12
3     P3        A    319       19
4     P4        B    125       23
5     P5        B    112       36

Example 1: Based on one existing variable, create a new variable

A new variable called “score” can be created using the following code depending on the value in the “points” column.

Top Data Science Skills to Get You Hired »

library(dplyr)

Let’s define new variable ‘score’ using mutate() and case_when()

df %>%
  mutate(score = case_when(points < 105 ~ 'LOW',
  points < 212 ~ 'MED',
  points < 450 ~ 'HIGH'))
  player position points rebounds score
1     P1        A    102       22   LOW
2     P2        B    215       12  HIGH
3     P3        A    319       19  HIGH
4     P4        B    125       23   MED
5     P5        B    112       36   MED

Example 2: Based on a number of existing variables, create a new variable

The following code demonstrates how to make a new variable called “type” based on the player and position values in the player column.

Tips for Rearranging Columns in R – Data Science Tutorials

library(dplyr)

Now we can define the  new variable ‘Type’ using mutate() and case_when()

df %>%
  mutate(Type = case_when(player == 'P1' | player == 'P2' ~ 'starter',
  player == 'P3' | player == 'P4' ~ 'backup',
  position == 'B' ~ 'reserve'))
   player position points rebounds    Type
1     P1        A    102       22 starter
2     P2        B    215       12 starter
3     P3        A    319       19  backup
4     P4        B    125       23  backup
5     P5        B    112       36 reserve

In order to generate a new variable called “value” depending on the value in the points and rebounds columns, use the following code.

Best online course for R programming – Data Science Tutorials

library(dplyr)

Let’s define the new variable ‘value’ using mutate() and case_when()

df %>%
  mutate(value = case_when(points <= 102 & rebounds <=45 ~ 2,
  points <=215 & rebounds > 55 ~ 4,
  points < 225 & rebounds < 28 ~ 6,
  points < 325 & rebounds > 29 ~ 7,
  points >=25 ~ 9))
player position points rebounds value
1     P1        A    102       22     2
2     P2        B    215       12     6
3     P3        A    319       19     9
4     P4        B    125       23     6
5     P5        B    112       36     7

Hope now you are clear with the concept.

The post How to do Conditional Mutate in R? appeared first on Data Science Tutorials

To leave a comment for the author, please follow the link and comment on their blog: Data Science Tutorials.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)