Animating Data Transformations III – separate()

[This article was first published on R Tutorials – Omni Analytics Group, and kindly contributed to R-bloggers].

We recently published two blog posts on animating data transformations. The first, Animating Data Transformations, illustrated the spread() and gather() functions for moving between wide and tall representations of data. The second, Animating Data Transformations II, covered the unnest() function for transforming a list column into a one-value-per-row format. Today, we’re going to introduce and animate the separate() function, which converts data with a single column containing multiple variables into a tidy, one-column-per-variable format.

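First, let’s create and view some sample data, where one column is actually the concatenation of three variables: two treatments and one value. The data is generated randomly without a seed, so your values will differ from the output shown here.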

library(tidyverse)
library(gganimate)

sample_data <- tibble(
    ID = 1:10,
    `TRT1_TRT2_Value` = paste(sample(LETTERS[1:3], 10, replace = TRUE), 
                              sample(LETTERS[1:3], 10, replace = TRUE),
                              round(rnorm(10)), sep = "_")
)

sample_data

# A tibble: 10 x 2
      ID TRT1_TRT2_Value
   <int> <chr>          
 1     1 B_A_1          
 2     2 B_A_-4         
 3     3 B_B_2          
 4     4 C_B_0          
 5     5 A_C_2          
 6     6 A_B_0          
 7     7 A_C_1          
 8     8 B_A_-1         
 9     9 B_B_0          
10    10 A_A_0 

Next, we use the separate() function, specifying the into parameter (the names of the new columns) and the sep parameter (the separator, an underscore in this case; a character sep is interpreted as a regular expression):

sample_data_separated <- sample_data %>%
    separate("TRT1_TRT2_Value", into = c("Treatment 1", "Treatment 2", "Value"), sep = "_")

sample_data_separated
# A tibble: 10 x 4
      ID `Treatment 1` `Treatment 2` Value
   <int> <chr>         <chr>         <chr>
 1     1 B             A             1    
 2     2 B             A             -4   
 3     3 B             B             2    
 4     4 C             B             0    
 5     5 A             C             2    
 6     6 A             B             0    
 7     7 A             C             1    
 8     8 B             A             -1   
 9     9 B             B             0    
10    10 A             A             0  
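
Note that the Value column above comes back as character (<chr>), because separate() leaves the split pieces as strings by default. A minimal sketch, assuming a three-row stand-in for sample_data (the names demo and demo_converted are ours, for illustration only), shows how the convert argument asks separate() to guess the column types:

```r
library(tibble)
library(tidyr)

# Hypothetical three-row stand-in for sample_data
demo <- tibble(
    ID = 1:3,
    TRT1_TRT2_Value = c("B_A_1", "B_A_-4", "C_B_0")
)

# convert = TRUE applies type conversion to the new columns,
# so the numeric-looking Value pieces become integers
demo_converted <- separate(
    demo, TRT1_TRT2_Value,
    into = c("Treatment 1", "Treatment 2", "Value"),
    sep = "_", convert = TRUE
)

class(demo_converted$Value)
class(demo_converted$`Treatment 1`)
```

The treatment columns stay character here because their pieces do not look like numbers. Type guessing can surprise you with other data (for example, a column holding only "T" and "F" may be read as logical), so it is worth checking the result.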

Next, as in the previous posts, we reshape each dataset into a long, one-row-per-cell format and combine the two into a single dataset that will be used to build the animation:

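# Reshape a table into one row per cell, treating the header as row 1
longDat <- function(x) {
    names(x) %>%
        rbind(x) %>%                             # prepend the column names as a first row
        setNames(seq_len(ncol(x))) %>%           # rename the columns to 1, 2, ...
        mutate(row = row_number()) %>%           # record each cell's row position
        tidyr::gather(column, value, -row) %>%   # one row per (row, column) cell
        mutate(column = as.integer(column)) %>%
        ungroup() %>%                            # defensive; the data is not grouped here
        arrange(column, row)
}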

long_tables <- map(list(sample_data, sample_data_separated), longDat)

combined_table <- long_tables[[1]] %>% 
    mutate(tstep = "a")

separated_table <- long_tables[[2]] %>% 
    mutate(tstep = "b")

both_tables <- bind_rows(combined_table, separated_table)

# Create the column first so the subset assignments below do not touch an uninitialised column
both_tables$celltype <- NA_character_
both_tables$celltype[both_tables$column == 1] <- c("header", rep("id", 10), "header2", rep("id", 10))
both_tables$celltype[both_tables$column == 2] <- c("header", rep("value_treatment", 10), "header2", rep("treatment", 10))
both_tables$celltype[both_tables$column == 3] <- c("header2", rep("treatment", 10))
both_tables$celltype[both_tables$column == 4] <- c("header2", rep("value", 10))

both_tables
# A tibble: 66 x 5
     row column value tstep celltype
   <int>  <int> <chr> <chr> <chr>   
 1     1      1 ID    a     header  
 2     2      1 1     a     id      
 3     3      1 2     a     id      
 4     4      1 3     a     id      
 5     5      1 4     a     id      
 6     6      1 5     a     id      
 7     7      1 6     a     id      
 8     8      1 7     a     id      
 9     9      1 8     a     id      
10    10      1 9     a     id      
# … with 56 more rows
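
To make the long layout concrete, here is a small sketch of the same idea on a two-row table. It is not the longDat() function above: long_cells() and tiny are hypothetical names, and the helper uses a character matrix rather than rbind() plus gather(). It assumes dplyr 1.0 or later for across().

```r
library(dplyr)
library(tibble)

# Hypothetical two-row table for illustration
tiny <- tibble(ID = 1:2, Code = c("B_A_1", "C_B_0"))

# One row per cell, with the header stored as row 1
long_cells <- function(x) {
    # Convert every column to character, then stack the header on top
    body  <- as.matrix(mutate(x, across(everything(), as.character)))
    cells <- rbind(names(x), body)
    # as.vector() walks the matrix column by column,
    # so the result is already ordered by column, then row
    tibble(
        row    = as.vector(row(cells)),
        column = as.vector(col(cells)),
        value  = as.vector(cells)
    )
}

tiny_long <- long_cells(tiny)
tiny_long
```

Each cell of the original table, including the header cells, becomes one row with its grid position, which is what geom_tile() and geom_text() need in order to draw the table as a plot.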

From this, we can produce static versions of the two images which will form the basis for the animation:

base_plot <- ggplot(both_tables, aes(column, -row, fill = celltype)) +
    geom_tile(color = "black") + 
    geom_text(aes(label = value), size = 6, fontface = "bold") +
    theme_void() +
    # Name each fill colour explicitly so the mapping does not depend on level order;
    # "header2" is left out of breaks so it gets a colour but no legend entry
    scale_fill_manual(values = c(header = "grey85", header2 = "grey85", id = "#ffebcc",
                                 treatment = "#d6e5ff", value = "#ffd6d7", value_treatment = "#f2d6ff"),
                      name = "",
                      labels = c("Header", "ID", "Treatment", "Value", "Value_Treatment"),
                      breaks = c("header", "id", "treatment", "value", "value_treatment")) +
    theme(
        plot.margin = unit(c(1, 1, 1, 1), "cm")
    )
p0 <- base_plot + 
    facet_wrap(~tstep)
p0

Finally, we use gganimate to build the final animation!

p1 <- base_plot +
    transition_states(
        states            = tstep,
        transition_length = 1,
        state_length      = 1
    ) +
    enter_fade() +
    exit_fade() +
    ease_aes('sine-in-out')

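p1_animate <- animate(p1, height = 800, width = 1200, fps = 20, duration = 10)
# Pass the rendered animation explicitly rather than relying on the last animation
anim_save("separate_animate.gif", animation = p1_animate)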
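
As a rough worked example of the timing implied by these settings, the arithmetic below assumes that transition_states() keeps its default wrap = TRUE, so the animation cycles a, a-to-b, b, b-to-a. The variable names are ours, and gganimate's exact frame allocation may differ slightly because of rounding.

```r
# Settings copied from the animate() and transition_states() calls above
fps               <- 20
duration          <- 10   # seconds
state_length      <- 1
transition_length <- 1
n_states          <- 2    # tstep takes the values "a" and "b"

# Total number of frames rendered
total_frames <- fps * duration

# With wrapping assumed, every state is followed by one transition
relative_units <- n_states * (state_length + transition_length)

# Approximate share of the animation for each pause or transition
frames_per_unit  <- total_frames / relative_units
seconds_per_unit <- duration / relative_units

c(total_frames = total_frames,
  frames_per_unit = frames_per_unit,
  seconds_per_unit = seconds_per_unit)
```

Under these assumptions the animation has 200 frames, with roughly 2.5 seconds spent on each table and on each transition between them.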

We hope you’ve enjoyed this third installment in our animating data transformations series! Stay tuned for more!

The post Animating Data Transformations III – separate() appeared first on Omni Analytics Group.
