How to Use Spread Function in R?-tidyr Part1

[This article was first published on Data Science Tutorials, and kindly contributed to R-bloggers].

The post How to Use Spread Function in R?-tidyr Part1 appeared first on Data Science Tutorials

To “spread” a key-value pair across multiple columns, use the spread() function from the tidyr package.

The basic syntax used by this function is as follows.

spread(data, key, value)

where:

data: Name of the data frame

key: column whose values will serve as the names of variables

value: column whose values will populate the new variables created from the key
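As a minimal sketch of the syntax above, assuming a small made-up data frame (the names long, measure, and reading are hypothetical and not part of the original examples), a call might look like this:

```r
library(tidyr)

# Hypothetical toy data: one row per (id, measure) pair
long <- data.frame(id = c(1, 1, 2, 2),
                   measure = c("x", "y", "x", "y"),
                   reading = c(10, 20, 30, 40))

# Each distinct value of `measure` becomes a column,
# filled with the matching entries of `reading`
wide <- spread(long, key = measure, value = reading)
wide
#   id  x  y
# 1  1 10 20
# 2  2 30 40
```

The columns not named as key or value (here only id) identify the rows of the result.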

How to Use Spread Function in R?

The practical application of this function is demonstrated in the examples that follow.


Example 1: Spread Values Across Two Columns

Let’s say we want to create the following data frame in R.

df <- data.frame(player=rep(c('P1', 'P2'), each=4),
                 year=rep(c(1, 1, 2, 2), times=2),
                 stat=rep(c('points', 'assists'), times=4),
                 amount=c(125, 142, 145, 157, 134, 213, 125, 214))

Now we can view the data frame

df
  player year    stat amount
1     P1    1  points    125
2     P1    1 assists    142
3     P1    2  points    145
4     P1    2 assists    157
5     P2    1  points    134
6     P2    1 assists    213
7     P2    2  points    125
8     P2    2 assists    214

The values of the stat column can be spread into their own columns using the spread() function.

library(tidyr)

Spread the stat column across several columns

spread(df, key=stat, value=amount)
  player year assists points
1     P1    1     142    125
2     P1    2     157    145
3     P2    1     213    134
4     P2    2     214    125
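In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(); spread() still works, but new code is usually written with the newer function. A rough equivalent of the call above, rebuilt on the same values so that it runs on its own, might look like this. The result is a tibble rather than a plain data frame, and the new columns may appear in a different order than with spread().

```r
library(tidyr)

# Same values as the printed data frame in Example 1
df <- data.frame(player = rep(c("P1", "P2"), each = 4),
                 year = rep(c(1, 1, 2, 2), times = 2),
                 stat = rep(c("points", "assists"), times = 4),
                 amount = c(125, 142, 145, 157, 134, 213, 125, 214))

# names_from plays the role of `key`, values_from the role of `value`
wide <- pivot_wider(df, names_from = stat, values_from = amount)
wide
```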

Example 2: Spread Values Across More Than Two Columns

Let’s say we want to create the following data frame in R.

df2 <- data.frame(player=rep(c('P1'), times=8),
                  year=rep(c(1, 2), each=4),
                  stat=rep(c('points', 'assists', 'steals', 'blocks'), times=2),
                  amount=c(115, 116, 212, 211, 229, 319, 213, 314))

Now we can view the data frame

df2
  player year    stat amount
1     P1    1  points    115
2     P1    1 assists    116
3     P1    1  steals    212
4     P1    1  blocks    211
5     P1    2  points    229
6     P1    2 assists    319
7     P1    2  steals    213
8     P1    2  blocks    314

The spread() function can be used to create four additional columns from the stat column’s four distinct values.

library(tidyr)

Spread the stat column across several columns

spread(df2, key=stat, value=amount)
  player year assists blocks points steals
1     P1    1     116    211    115    212
2     P1    2     319    314    229    213
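Both examples have a value for every combination of player, year, and stat. As a sketch of what happens when one combination is missing, assume a hypothetical data frame df3 (not from the original examples) in which P1 has no assists row for year 2. By default spread() leaves that cell as NA, and its fill argument can supply a replacement value instead:

```r
library(tidyr)

# Hypothetical data: there is no 'assists' row for year 2
df3 <- data.frame(player = "P1",
                  year = c(1, 1, 2),
                  stat = c("points", "assists", "points"),
                  amount = c(115, 116, 229))

# Default behaviour: the missing combination becomes NA
with_na <- spread(df3, key = stat, value = amount)

# With fill = 0 the missing combination becomes 0 instead
with_zero <- spread(df3, key = stat, value = amount, fill = 0)

with_na
with_zero
```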
