Filtering for Unique Values in R Using dplyr

This article was first published on Data Science Tutorials and kindly contributed to R-bloggers.


Using the dplyr package in R, you can filter for unique values in a data frame with the distinct() function in the following ways.

Method 1: Filter for Unique Values in One Column

df %>% distinct(var1)

Method 2: Filter for Unique Values in Multiple Columns

df %>% distinct(var1, var2)

Method 3: Filter for Unique Values in All Columns

df %>% distinct()

The examples below show how to use each method in practice with the following data frame.


# create a data frame

df <- data.frame(team=c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 rebounds=c(8, 6, 5, 4, 3, 8, 9, 5),
                 points=c(107, 207, 208, 211, 213, 215, 219, 313))

# view the data frame

df
  team rebounds points
1    X        8    107
2    X        6    207
3    X        5    208
4    X        4    211
5    Y        3    213
6    Y        8    215
7    Y        9    219
8    Y        5    313

Example 1: Filter for Unique Values in One Column

To filter for unique values in just the team column, we can use the following code.


library(dplyr)

# select only the unique values in the team column

df %>% distinct(team)
  team
1    X
2    Y

Note that only the unique values of the team column are returned; the rebounds and points columns are dropped from the result.
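
If the other columns are still needed, distinct() accepts a .keep_all argument. The sketch below rebuilds the example data frame (with rebounds stored as numbers) so that it runs on its own, and keeps the first row encountered for each team. The name first_per_team is only illustrative.

```r
library(dplyr)

# rebuild the example data frame so this snippet runs on its own
df <- data.frame(team=c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 rebounds=c(8, 6, 5, 4, 3, 8, 9, 5),
                 points=c(107, 207, 208, 211, 213, 215, 219, 313))

# keep the first row for each unique team, retaining all columns
first_per_team <- df %>% distinct(team, .keep_all = TRUE)
first_per_team
```

With this data, the result should contain two rows: the first X row (8 rebounds, 107 points) and the first Y row (3 rebounds, 213 points).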

Example 2: Filter for Unique Values in Multiple Columns

To filter for unique values in the team and points columns, we can use the following code:

library(dplyr)

# select unique combinations of values in the team and points columns

df %>% distinct(team, points)
  team points
1    X    107
2    X    207
3    X    208
4    X    211
5    Y    213
6    Y    215
7    Y    219
8    Y    313

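Note that only the team and points columns are returned, with one row for each unique combination of the two. Because every points value in this data frame is different, all eight combinations are unique and no rows are removed.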
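
One way to confirm why all eight rows come back is to compare row counts before and after calling distinct(), and to count the distinct points values with dplyr's n_distinct(). This is a minimal sketch that rebuilds the example data frame so it runs on its own; the names n_before, n_after and n_points are only illustrative.

```r
library(dplyr)

# rebuild the example data frame so this snippet runs on its own
df <- data.frame(team=c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 rebounds=c(8, 6, 5, 4, 3, 8, 9, 5),
                 points=c(107, 207, 208, 211, 213, 215, 219, 313))

# number of rows before and after de-duplicating on team and points
n_before <- nrow(df)
n_after <- nrow(df %>% distinct(team, points))

# number of different points values in the data
n_points <- n_distinct(df$points)

c(n_before, n_after, n_points)
```

All three counts should be 8 here: since no points value repeats, no team and points pair can repeat either.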


Example 3: Filter for Unique Values in All Columns

To filter for unique values across all columns in the data frame, we can use the following code.

library(dplyr)

# select unique rows across all columns

df %>% distinct()
  team rebounds points
1    X        8    107
2    X        6    207
3    X        5    208
4    X        4    211
5    Y        3    213
6    Y        8    215
7    Y        9    219
8    Y        5    313

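Note that distinct() with no arguments returns the unique rows, comparing all three columns together. Since no row in this data frame is an exact duplicate of another, all eight rows are returned.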
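
Because the example data frame has no repeated rows, the effect of distinct() is easier to see on data that does contain a duplicate. The sketch below rebuilds the example data frame, appends a copy of its first row, and then removes it again. The names df_dup and deduped are hypothetical and are not part of the original example.

```r
library(dplyr)

# rebuild the example data frame so this snippet runs on its own
df <- data.frame(team=c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 rebounds=c(8, 6, 5, 4, 3, 8, 9, 5),
                 points=c(107, 207, 208, 211, 213, 215, 219, 313))

# append a copy of the first row so the data contains one exact duplicate
df_dup <- bind_rows(df, df[1, ])
nrow(df_dup)

# drop rows that are identical across all columns
deduped <- df_dup %>% distinct()
nrow(deduped)
```

Here df_dup should have nine rows and deduped eight, since only the appended copy is identical to an earlier row.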
