Select the First Row by Group in R

[This article was first published on Data Science Tutorials, and kindly contributed to R-bloggers].

The post Select the First Row by Group in R appeared first on Data Science Tutorials

When working with the dplyr package in R, you will often want to select the first row in each group. The simple syntax shown below does this.

Suppose we have a small dataset of teams and the points they scored.

First, create the data frame:

df <- data.frame(team=c('P1', 'P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2', 'P3', 'P3'),
                 points=c(56, 94, 17, 57, 55, 15, 37, 44, 55, 32))

Now we can view the data frame

df
   team points
1    P1     56
2    P1     94
3    P1     17
4    P1     57
5    P2     55
6    P2     15
7    P2     37
8    P2     44
9    P3     55
10   P3     32

To select the first row in each group, sorted by points, use the dplyr package as demonstrated in the code below.

library(dplyr)
df %>%
  group_by(team) %>%
  arrange(points) %>%
  filter(row_number()==1)
  team  points
  <chr>  <dbl>
1 P2        15
2 P1        17
3 P3        32
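Note that arrange() ignores grouping unless .by_group = TRUE is set, which is why the rows above come back ordered by points rather than by team. As an alternative, here is a minimal sketch that picks the lowest-scoring row per team with slice_min(), assuming dplyr 1.0.0 or later is installed. The name lowest is only illustrative, and with_ties = FALSE is an assumption that you want exactly one row per group even when values tie.

```r
library(dplyr)

# Same example data as above
df <- data.frame(team=c('P1', 'P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2', 'P3', 'P3'),
                 points=c(56, 94, 17, 57, 55, 15, 37, 44, 55, 32))

# Keep the row with the smallest points value within each team.
# with_ties = FALSE returns a single row per group even if the minimum is tied.
lowest <- df %>%
  group_by(team) %>%
  slice_min(points, n = 1, with_ties = FALSE) %>%
  ungroup()

lowest
```

This should return the same three rows as the arrange() and filter() version, but listed by team (P1, P2, P3) rather than by points.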

By default, arrange() sorts the data in ascending order. We can just as easily sort the values in descending order instead.

df %>%
  group_by(team) %>%
  arrange(desc(points)) %>%
  filter(row_number()==1)
  team  points
  <chr>  <dbl>
1 P1        94
2 P2        55
3 P3        55

Note that this code can easily be adapted to select the nth row of each group. Just change the condition to row_number() == n.

For instance, you can use the following syntax to select the second row in each group:

df %>%
  group_by(team) %>%
  arrange(desc(points)) %>%
  filter(row_number()==2)
  team  points
  <chr>  <dbl>
1 P1        57
2 P2        44
3 P3        32
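One thing to keep in mind with this pattern is that a group with fewer than n rows contributes nothing to the result. The sketch below illustrates this with the third row per team. The variable k and the name third are hypothetical, and .by_group = TRUE is added here only so that the output is ordered by team.

```r
library(dplyr)

# Same example data as above
df <- data.frame(team=c('P1', 'P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2', 'P3', 'P3'),
                 points=c(56, 94, 17, 57, 55, 15, 37, 44, 55, 32))

k <- 3  # hypothetical: which row to keep within each group

third <- df %>%
  group_by(team) %>%
  arrange(desc(points), .by_group = TRUE) %>%  # sort within each team
  filter(row_number() == k) %>%                # keep the k-th row per team
  ungroup()

third
```

Team P3 has only two rows, so it does not appear in the result when k is 3.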

Alternatively, you can use the syntax shown below to select the last row in each group.

df %>%
  group_by(team) %>%
  arrange(desc(points)) %>%
  filter(row_number()==n())
  team  points
  <chr>  <dbl>
1 P3        32
2 P1        17
3 P2        15
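All of the examples above sort by points first, so "first" and "last" mean lowest or highest score. If you instead want the first or last row of each group in the data frame's existing order, a minimal sketch without any sorting could look like this, assuming dplyr 1.0.0 or later for slice_head() and slice_tail(). The names first_rows and last_rows are only illustrative.

```r
library(dplyr)

# Same example data as above
df <- data.frame(team=c('P1', 'P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2', 'P3', 'P3'),
                 points=c(56, 94, 17, 57, 55, 15, 37, 44, 55, 32))

# First row of each team in the original row order
first_rows <- df %>%
  group_by(team) %>%
  slice_head(n = 1) %>%
  ungroup()

# Last row of each team in the original row order
last_rows <- df %>%
  group_by(team) %>%
  slice_tail(n = 1) %>%
  ungroup()

first_rows
last_rows
```

Here first_rows should hold the points values 56, 55, and 55, and last_rows the values 57, 44, and 32, for teams P1, P2, and P3 respectively.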
