Combine Rows with Same Column Values in R

[This article was first published on Data Analysis in R, and kindly contributed to R-bloggers].

The post Combine Rows with Same Column Values in R appeared first on finnstats.

If you are interested in learning more about data science, you can find more articles at finnstats.

To combine rows that share the same column values in a data frame in R, use the basic syntax shown below.

library(dplyr)
df %>%
  group_by(group_var1, group_var2) %>%
  summarise(across(c(values_var1, values_var2), sum))

The usage of this syntax in practice is demonstrated by the example that follows.

Combining Rows with the Same Column Values in R

Consider the following data set, which records a score and a rank for several individuals. Some individuals appear in more than one row.

First, create the data frame.

df <- data.frame(id=c(11, 11, 12, 13, 13, 13),
                 Name=c('A', 'A', 'B', 'C', 'C', 'C'),
                 score=c(4, 1, 3, 2, 5, 3),
                 rank=c(1, 2, 2, 1, 3, 2))

Now we can view the data frame.

df
  id Name score rank
1 11    A     4    1
2 11    A     1    2
3 12    B     3    2
4 13    C     2    1
5 13    C     5    3
6 13    C     3    2

To combine rows that have the same values in the id and Name columns and aggregate the remaining columns, use the syntax shown below.

library(dplyr)

# Combine rows with matching id and Name, then sum the remaining columns.

df %>%
  group_by(id, Name) %>%
  summarise(across(c(score, rank), sum))
# A tibble: 3 x 4
# Groups:   id [3]
     id Name  score  rank
1    11 A         5     3
2    12 B         3     2
3    13 C        10     6

The result is a data frame with one row for each unique combination of id and Name. Its score and rank columns hold the sums of the values from the rows that were combined.
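Note the "# Groups:   id [3]" line in the output above: by default, summarise() removes only the last grouping variable, so the result is still grouped by id. If you would rather get a plain, ungrouped tibble back, one option is the .groups argument of summarise(). The following is a minimal sketch, assuming dplyr 1.0.0 or later; the object name combined is only illustrative and is not part of the original example.

```r
library(dplyr)

# Same example data as above
df <- data.frame(id    = c(11, 11, 12, 13, 13, 13),
                 Name  = c('A', 'A', 'B', 'C', 'C', 'C'),
                 score = c(4, 1, 3, 2, 5, 3),
                 rank  = c(1, 2, 2, 1, 3, 2))

# Sum score and rank within each id/Name pair and drop all grouping afterwards
combined <- df %>%
  group_by(id, Name) %>%
  summarise(across(c(score, rank), sum), .groups = "drop")

combined

# No grouping variables remain on the result
group_vars(combined)
```

Dropping the groups this way also avoids surprises when further dplyr verbs are applied to the result, since they will no longer operate per id.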

We chose to combine the score and rank columns using the sum function, but you can aggregate by a different statistic, such as the mean, if you’d like.
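As an illustration of swapping in another statistic, the sketch below replaces sum with mean inside across(). It reuses the same example data. The name df_mean and the .groups = "drop" argument are additions made here for illustration and are not part of the original example.

```r
library(dplyr)

# Same example data as above
df <- data.frame(id    = c(11, 11, 12, 13, 13, 13),
                 Name  = c('A', 'A', 'B', 'C', 'C', 'C'),
                 score = c(4, 1, 3, 2, 5, 3),
                 rank  = c(1, 2, 2, 1, 3, 2))

# Average score and rank within each id/Name pair instead of summing them
df_mean <- df %>%
  group_by(id, Name) %>%
  summarise(across(c(score, rank), mean), .groups = "drop")

df_mean
```

Any function that reduces a vector to a single value, such as median, min, or max, should work in the same position.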
