When the “reorder” function just isn’t good enough…

May 6, 2013

(This article was first published on Data and Analysis with R, at Work, and kindly contributed to R-bloggers)

The reorder function in R 3.0.0 is behaving strangely (or I'm really not understanding something). Take the following simple data frame:

df = data.frame(a1 = c(4,1,1,3,2,4,2), a2 = c("h","j","j","e","c","h","c"))

I expect that if I call reorder on the a2 vector, with the a1 vector as the values to order by, then any summary stats I run on the result will list the levels of a2 in the order given by the numbers in a1. However, look what happens:

table(reorder(df$a2, df$a1))
c e h j 
2 1 2 2
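For comparison, the documented behaviour of reorder is to compute a summary of the second vector within each level of the first (the mean, by default) and then sort the levels by that summary. The sketch below works that out by hand for the same data. It assumes a fresh session with only the base packages attached, and it sets stringsAsFactors = TRUE so that a2 is a factor, as it would have been by default in R 3.0.0. The names scores, by_mean and res are only for illustration.

```r
# Rebuild the example; stringsAsFactors = TRUE mimics the R 3.0.0 default
df <- data.frame(a1 = c(4, 1, 1, 3, 2, 4, 2),
                 a2 = c("h", "j", "j", "e", "c", "h", "c"),
                 stringsAsFactors = TRUE)

# Mean of a1 within each level of a2 (levels start out alphabetical)
scores <- tapply(df$a1, df$a2, mean)

# Level order implied by sorting those means in increasing order
by_mean <- names(sort(scores))

# The same data passed through reorder(), using its default FUN = mean
res <- reorder(df$a2, df$a1)
levels(res)
```

Under those assumptions both by_mean and levels(res) come out as j, c, e, h, which is not what the table above shows. One possible explanation, and it is only a guess, is that something else in the session (for example an add-on package that supplies its own reorder method for factors) was changing the behaviour.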

I found out that to get the levels in the order specified by the numbers in the first vector, the following code seems to work:

df$a2 = factor(df$a2, levels=unique(df$a2)[order(unique(df$a1))], ordered=TRUE)

Now look at the result:

table(df$a2)

j c e h 
2 2 1 2
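The unique-based line works here because every a2 value is paired with exactly one a1 value (and vice versa), so the two unique() calls line up position by position. A slightly more defensive sketch of the same idea keeps each pair together before ordering. Here pairs and counts are illustrative names, and the stopifnot guard is an added assumption about what the data should look like, not something from the original code.

```r
df <- data.frame(a1 = c(4, 1, 1, 3, 2, 4, 2),
                 a2 = c("h", "j", "j", "e", "c", "h", "c"),
                 stringsAsFactors = FALSE)

# Keep one row per distinct (a1, a2) pair so the two columns stay aligned
pairs <- unique(df[, c("a1", "a2")])

# Guard: this only makes sense if each a2 value has a single a1 value
stopifnot(anyDuplicated(pairs$a2) == 0)

# Order the a2 values by their a1 value and use that as the level order
df$a2 <- factor(df$a2, levels = pairs$a2[order(pairs$a1)], ordered = TRUE)

counts <- table(df$a2)
counts
```

If an a2 value ever appeared with two different a1 values, the guard would stop with an error instead of silently producing a misaligned level order.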

One thing I notice here is that R keeps factor levels in alphabetical order by default. When I supply the levels explicitly through the levels argument (here built with the "unique" function), R uses that order instead of the alphabetical one.
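A small illustration of the default level order versus an explicit one, using the same values. The names x, default_levels and custom_levels are made up for this sketch.

```r
x <- c("h", "j", "j", "e", "c", "h", "c")

# With no levels argument, factor() uses the sorted unique values
default_levels <- levels(factor(x))

# With an explicit levels argument, the supplied order is kept as given;
# unique(x) returns the values in order of first appearance
custom_levels <- levels(factor(x, levels = unique(x)))

default_levels
custom_levels
```

The first result is c, e, h, j and the second is h, j, e, c, so the alphabetical ordering is only a default and not something factor() enforces.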

Why won’t the reorder function work in this case?

