tidyr 0.5.0

June 13, 2016

(This article was first published on RStudio Blog, and kindly contributed to R-bloggers)

I’m pleased to announce tidyr 0.5.0. tidyr makes it easy to “tidy” your data, storing it in a consistent form so that it’s easy to manipulate, visualise and model. Tidy data has a simple convention: put variables in the columns and observations in the rows. You can learn more about it in the tidy data vignette. Install it with:

install.packages("tidyr")
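
As a small illustration of the tidy data convention mentioned above, here is a sketch that reshapes a table with one column per year into one row per observation. The data frame and its column names (country, y2014, y2015, cases) are made up for this example and are not from the release.

```r
library(tidyr)

# Hypothetical untidy table: the variable "year" is hidden in the column names.
wide <- data.frame(
  country = c("A", "B"),
  y2014 = c(10, 20),
  y2015 = c(11, 21),
  stringsAsFactors = FALSE
)

# gather() collects the year columns into key/value pairs, giving one row
# per country-year observation: variables in columns, observations in rows.
tidy <- gather(wide, year, cases, -country)
tidy
```

The result has four rows and the columns country, year and cases.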

This release has three useful new features:

  1. separate_rows() splits cells that hold several values separated by a delimiter into multiple rows. Thanks to Aaron Wolen for the contribution!
    library(tidyr)
    library(dplyr)  # provides data_frame() and %>%
    df <- data_frame(x = 1:2, y = c("a,b", "d,e,f"))
    df %>% 
      separate_rows(y, sep = ",")
    #> Source: local data frame [5 x 2]
    #> 
    #>       x     y
    #>   <int> <chr>
    #> 1     1     a
    #> 2     1     b
    #> 3     2     d
    #> 4     2     e
    #> 5     2     f

    Compare with separate(), which splits the values into (named) columns:

    df %>% 
      separate(y, c("y1", "y2", "y3"), sep = ",", fill = "right")
    #> Source: local data frame [2 x 4]
    #> 
    #>       x    y1    y2    y3
    #> * <int> <chr> <chr> <chr>
    #> 1     1     a     b  <NA>
    #> 2     2     d     e     f
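  2. spread() gains a sep argument. Setting it names each new column by pasting together the key column's name, sep, and the key value (for example key_1 rather than 1). This is useful when you’re spreading on a numeric column, where the default names would be bare numbers: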
    df <- data_frame(
      x = c(1, 2, 1), 
      key = c(1, 1, 2), 
      val = c("a", "b", "c")
    )
    df %>% spread(key, val)
    #> Source: local data frame [2 x 3]
    #> 
    #>       x     1     2
    #> * <dbl> <chr> <chr>
    #> 1     1     a     c
    #> 2     2     b  <NA>
    df %>% spread(key, val, sep = "_")
    #> Source: local data frame [2 x 3]
    #> 
    #>       x key_1 key_2
    #> * <dbl> <chr> <chr>
    #> 1     1     a     c
    #> 2     2     b  <NA>
  3. unnest() gains a .sep argument. This is useful if you have multiple columns of data frames that have the same variable names:
    df <- data_frame(
      x = 1:2,
      y1 = list(
        data_frame(y = 1),
        data_frame(y = 2)
      ),
      y2 = list(
        data_frame(y = "a"),
        data_frame(y = "b")
      )
    )
    df %>% unnest()
    #> Source: local data frame [2 x 3]
    #> 
    #>       x     y     y
    #>   <int> <dbl> <chr>
    #> 1     1     1     a
    #> 2     2     2     b
    df %>% unnest(.sep = "_")
    #> Source: local data frame [2 x 3]
    #> 
    #>       x  y1_y  y2_y
    #>   <int> <dbl> <chr>
    #> 1     1     1     a
    #> 2     2     2     b

    It also gains a .id argument, which adds a column recording the names of the list elements:

    df <- data_frame(
      x = 1:2,
      y = list(
        a = 1:3,
        b = 3:1
      )
    )
    df %>% unnest()
    #> Source: local data frame [6 x 2]
    #> 
    #>       x     y
    #>   <int> <int>
    #> 1     1     1
    #> 2     1     2
    #> 3     1     3
    #> 4     2     3
    #> 5     2     2
    #> 6     2     1
    df %>% unnest(.id = "id")
    #> Source: local data frame [6 x 3]
    #> 
    #>       x     y    id
    #>   <int> <int> <chr>
    #> 1     1     1     a
    #> 2     1     2     a
    #> 3     1     3     a
    #> 4     2     3     b
    #> 5     2     2     b
    #> 6     2     1     b
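
The first two features can also be tried on plain base R data frames. The following is a minimal sketch, not taken from the release notes; the data frames orders and scores and their column names are hypothetical.

```r
library(tidyr)

# Hypothetical orders: each cell of `items` holds a comma-separated list.
orders <- data.frame(
  order = c(1, 2),
  items = c("tea,milk", "tea"),
  stringsAsFactors = FALSE
)

# separate_rows() gives one row per item, repeating `order` as needed.
long <- separate_rows(orders, items, sep = ",")
long

# Hypothetical scores keyed by a numeric attempt number.
scores <- data.frame(
  student = c("ann", "ann", "bob"),
  attempt = c(1, 2, 1),
  score = c(70, 85, 90),
  stringsAsFactors = FALSE
)

# With sep = "_" the new columns are named attempt_1 and attempt_2
# instead of the bare numbers 1 and 2.
wide <- spread(scores, attempt, score, sep = "_")
names(wide)
```

Here bob has no second attempt, so his attempt_2 cell is filled with NA.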

tidyr 0.5.0 also includes a bumper crop of bug fixes, including fixes for spread() and gather() in the presence of list-columns. Please see the release notes for a complete list of changes.
