Crosstab calculation in R

[This article was first published on Data Science Tutorials, and kindly contributed to R-bloggers.]

The post Crosstab calculation in R appeared first on Data Science Tutorials

To create a crosstab (a table of counts for each combination of two variables) using functions from the dplyr and tidyr packages in R, use the following basic syntax:

df %>%
  group_by(var1, var2) %>%
  tally() %>%
  spread(var1, n)

The examples below demonstrate how to utilize this syntax in practice.


Example 1: Make a simple crosstab

Let’s start by creating the following data frame:

df <- data.frame(team=c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 position=c('A', 'A', 'B', 'C', 'C', 'C', 'D', 'D'),
                 points=c(107, 207, 208, 211, 213, 215, 219, 313))

Now we can view the data frame:

df
   team position points
1    X        A    107
2    X        A    207
3    X        B    208
4    X        C    211
5    Y        C    213
6    Y        C    215
7    Y        D    219
8    Y        D    313

To make a crosstab for the ‘team’ and ‘position’ variables, first load the dplyr and tidyr packages:


library(dplyr)
library(tidyr)

Now we can produce the crosstab, with one row per position and one column per team:

df %>%
  group_by(team, position) %>%
  tally() %>%
  spread(team, n)
  position     X     Y
  <chr>    <int> <int>
1 A            2    NA
2 B            1    NA
3 C            1     2
4 D           NA     2

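Here is how to interpret the values in the crosstab:

There are 2 players who have a position of ‘A’ and belong to team ‘X’.

There is 1 player who has a position of ‘B’ and belongs to team ‘X’.

An NA marks a combination that does not occur in the data; for example, no player on team ‘Y’ has a position of ‘A’.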
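If zeros are preferable to NA for the empty combinations, spread() accepts a fill argument. The following is a minimal sketch that reuses the example data frame; the object name crosstab_filled is only illustrative and is not part of the original post.

```r
library(dplyr)
library(tidyr)

# Same example data as in the post
df <- data.frame(team = c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 position = c('A', 'A', 'B', 'C', 'C', 'C', 'D', 'D'),
                 points = c(107, 207, 208, 211, 213, 215, 219, 313))

# fill = 0 replaces the missing team/position combinations with 0
crosstab_filled <- df %>%
  group_by(team, position) %>%
  tally() %>%
  spread(team, n, fill = 0)

crosstab_filled
```

With the fill in place, team ‘Y’ shows 0 rather than NA for positions ‘A’ and ‘B’, and team ‘X’ shows 0 for position ‘D’.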


It’s worth noting that we can swap the crosstab’s rows and columns by changing the key column passed to the spread() function.

library(dplyr)
library(tidyr)

Let’s produce a crosstab with ‘position’ along the columns and ‘team’ along the rows:


df %>%
  group_by(team, position) %>%
  tally() %>%
  spread(position, n)
  team      A     B     C     D
  <chr> <int> <int> <int> <int>
1 X         2     1     1    NA
2 Y        NA    NA     2     2
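The current tidyr documentation marks spread() as superseded by pivot_wider(). As a hedged alternative, the same crosstab could be sketched with count() and pivot_wider(), assuming tidyr 1.0.0 or later is installed. The object name crosstab_wide is illustrative and does not come from the original post.

```r
library(dplyr)
library(tidyr)

# Same example data as in the post
df <- data.frame(team = c('X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'),
                 position = c('A', 'A', 'B', 'C', 'C', 'C', 'D', 'D'),
                 points = c(107, 207, 208, 211, 213, 215, 219, 313))

# count() is shorthand for group_by() + tally();
# pivot_wider() moves each position into its own column,
# and values_fill puts 0 where a team/position pair is absent
crosstab_wide <- df %>%
  count(team, position) %>%
  pivot_wider(names_from = position,
              values_from = n,
              values_fill = list(n = 0L))

crosstab_wide
# Counts by team: X has 2, 1, 1, 0 and Y has 0, 0, 2, 2 for positions A to D
```

Passing team instead of position to names_from gives the transposed layout, mirroring the change of key column in spread().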
