Good riddance to Excel pivot tables

January 30, 2011
By

(This article was first published on Recology, and kindly contributed to R-bloggers)

Excel pivot tables have been how I have reorganized data…up until now. These are just a couple of examples why R is superior to Excel for reorganizing data:

`################ Good riddance to pivot tables ############library(reshape2)library(plyr) dataset <- data.frame(var1 = rep(c("a","b","c","d","e","f"), each = 4),  var2 = rep(c("level1","level1","level2","level2"), 6),  var3 = rep(c("h","m"), 12), meas = rep(1:12))`

Created by Pretty R at inside-R.org

`# simply pivot tablecast(dataset, var1 ~ var2 + var3)Using meas as value column.  Use the value argument to cast to override this choice  var1 level1_h level1_m level2_h level2_m1    a        1        2        3        42    b        5        6        7        83    c        9       10       11       124    d        1        2        3        45    e        5        6        7        86    f        9       10       11       12`
` # mean by var1 and var2cast(dataset, var1 ~ var2, mean)`
`Using meas as value column.  Use the value argument to cast to override this choice  var1 level1 level21    a    1.5    3.52    b    5.5    7.53    c    9.5   11.54    d    1.5    3.55    e    5.5    7.56    f    9.5   11.5 # mean by var1 and var3cast(dataset, var1 ~ var3, mean)Using meas as value column.  Use the value argument to cast to override this choice  var1  h  m1    a  2  32    b  6  73    c 10 114    d  2  35    e  6  76    f 10 11`
` # mean by var1, var2 and var3 (version 1)cast(dataset, var1 ~ var2 + var3, mean)`
`Using meas as value column.  Use the value argument to cast to override this choice  var1 level1_h level1_m level2_h level2_m1    a        1        2        3        42    b        5        6        7        83    c        9       10       11       124    d        1        2        3        45    e        5        6        7        86    f        9       10       11       12 # mean by var1, var2 and var3 (version 2)cast(dataset, var1 + var2 ~ var3, mean)`
`Using meas as value column.  Use the value argument to cast to override this choice   var1   var2  h  m1     a level1  1  22     a level2  3  43     b level1  5  64     b level2  7  85     c level1  9 106     c level2 11 127     d level1  1  28     d level2  3  49     e level1  5  610    e level2  7  811    f level1  9 1012    f level2 11 12 # use package plyr to create flexible data frames...dataset_plyr <- ddply(dataset, .(var1, var2), summarise,  mean = mean(meas),  se = sd(meas), CV = sd(meas)/mean(meas))> dataset_plyr   var1   var2 mean        se         CV1     a level1  1.5 0.7071068 0.471404522     a level2  3.5 0.7071068 0.202030513     b level1  5.5 0.7071068 0.128564874     b level2  7.5 0.7071068 0.094280905     c level1  9.5 0.7071068 0.074432296     c level2 11.5 0.7071068 0.061487557     d level1  1.5 0.7071068 0.471404528     d level2  3.5 0.7071068 0.202030519     e level1  5.5 0.7071068 0.1285648710    e level2  7.5 0.7071068 0.0942809011    f level1  9.5 0.7071068 0.0744322912    f level2 11.5 0.7071068 0.06148755`

` # ...to use for plottingqplot(var1, mean, colour = var2, size = CV, data = dataset_plyr, geom = "point")`

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

Tags: , ,