
The post How to Find Unmatched Records in R appeared first on Data Science Tutorials

To retrieve all rows in one data frame that do not have matching values in another data frame, use the anti_join() function from R's dplyr package.

The basic syntax used by this function is as follows.


`anti_join(df1, df2, by='col_name')`
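If the key column has a different name in each data frame, `by` also accepts a named character vector. The following is a minimal sketch: the `orders` and `customers` data frames and their columns are made up for illustration and are not part of the examples in this post.

```r
library(dplyr)

# Hypothetical data: the key is called 'cust_id' in one table and 'id' in the other
orders <- data.frame(cust_id = c(1, 2, 3, 4),
                     amount = c(20, 35, 15, 50))
customers <- data.frame(id = c(1, 3),
                        name = c('Ann', 'Raj'))

# Keep the orders whose cust_id has no matching id in customers
unmatched <- anti_join(orders, customers, by = c('cust_id' = 'id'))
unmatched
```

Only the columns of the first data frame are returned, since an anti join filters rows and does not add columns from the second data frame.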

The usage of this syntax is demonstrated in the examples that follow.

## Example 1: Use anti_join() with One Column

Suppose we have the two R data frames shown below:

Let’s build the data frames:

```r
df1 <- data.frame(Q1 = c('a', 'b', 'c', 'd', 'e', 'f'),
                  Q2 = c(152, 514, 114, 218, 322, 323))
df2 <- data.frame(Q1 = c('a', 'a', 'a', 'b', 'b', 'b'),
                  Q3 = c(523, 324, 233, 134, 237, 141))
```

To return all rows in the first data frame that don’t have a matching Q1 in the second data frame, we can use the anti_join() function.


```r
library(dplyr)

# use the 'Q1' column to perform the anti join
anti_join(df1, df2, by='Q1')

#   Q1  Q2
# 1  c 114
# 2  d 218
# 3  e 322
# 4  f 323
```

We can see that exactly four rows in the first data frame have a Q1 value with no match in the second data frame.
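As a cross-check, the same rows can be selected in base R without dplyr. This is only a sketch of an equivalent filter for the single-column case, not how anti_join() is implemented; the name `unmatched_base` is chosen here for illustration.

```r
# Same data frames as in Example 1
df1 <- data.frame(Q1 = c('a', 'b', 'c', 'd', 'e', 'f'),
                  Q2 = c(152, 514, 114, 218, 322, 323))
df2 <- data.frame(Q1 = c('a', 'a', 'a', 'b', 'b', 'b'),
                  Q3 = c(523, 324, 233, 134, 237, 141))

# Keep the rows of df1 whose Q1 value never appears in df2$Q1
unmatched_base <- df1[!(df1$Q1 %in% df2$Q1), ]
unmatched_base
```

One visible difference is that base R subsetting keeps the original row names (3 to 6 here), whereas the anti_join() output above is numbered from 1.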

## Example 2: Use anti_join() with Multiple Columns

Suppose we have the two R data frames shown below.


Let’s create the data frames:

```r
df1 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'F', 'G', 'F', 'C'),
                  points=c(152, 114, 219, 254, 356, 441))
df2 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'C', 'G', 'F', 'F'),
                  points=c(142, 214, 319, 133, 517, 422))
```

All rows in the first data frame that lack a matching team and position in the second data frame can be returned using the anti_join() function:

```r
library(dplyr)

# perform the anti join on the 'team' and 'position' columns
anti_join(df1, df2, by=c('team', 'position'))

#   team position points
# 1    A        F    219
# 2    B        C    441
```

We can see that there are exactly two records from the first data frame that do not have a matching team name and position in the second data frame.
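One way to sanity-check a result like this is to compare it with dplyr's semi_join(), which keeps the rows of the first data frame that do have a match. The sketch below assumes the Example 2 data; the names `unmatched` and `matched` are chosen here for illustration.

```r
library(dplyr)

# Same data frames as in Example 2
df1 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'F', 'G', 'F', 'C'),
                  points=c(152, 114, 219, 254, 356, 441))
df2 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'C', 'G', 'F', 'F'),
                  points=c(142, 214, 319, 133, 517, 422))

# Rows of df1 without a matching team/position pair in df2
unmatched <- anti_join(df1, df2, by=c('team', 'position'))

# Rows of df1 with a matching team/position pair in df2
matched <- semi_join(df1, df2, by=c('team', 'position'))

# Every row of df1 falls into exactly one of the two groups
nrow(unmatched) + nrow(matched) == nrow(df1)
```

Because both joins only filter the rows of the first data frame, the two row counts should add up to the number of rows in df1.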
