Hack: The “count(case when … else … end)” in dplyr

[This article was first published on R – Predictive Hacks, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

When I run quires in SQL (or even HiveQL, Spark SQL and so on), it is quite common to use the syntax of count(case when.. else ... end). Today, I will provide you an example of how you run this type of commands in dplyr.

Let’s start:

library(sqldf)
library(dplyr)


df<-data.frame(id = 1:10,
               gender = c("m","m","m","f","f","f","m","f","f","f"),
               amt= c(5,20,30,10,20,50,5,20,10,30))

df
 

Let’s get the count and the sum per gender in different columns in SQL.

sqldf("select count(case when gender='m' then id else null end) as male_cnt,
              count(case when gender='f' then id else null end) as female_cnt,
              sum(case when gender='m' then amt else 0 end) as male_amt,
              sum(case when gender='f' then amt else 0 end) as female_amt
              from df")
 

Output:

  male_cnt female_cnt male_amt female_amt
1        4          6       60        140

Let’s get the same output in dplyr. We will need to subset the data frame based on one column.

df%>%summarise(male_cnt=length(id[gender=="m"]),
               female_cnt=length(id[gender=="f"]),
               male_amt=sum(amt[gender=="m"]),
               female_amt=sum(amt[gender=="f"])
               )
 

Output:

  male_cnt female_cnt male_amt female_amt
1        4          6       60        140

To leave a comment for the author, please follow the link and comment on their blog: R – Predictive Hacks.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)