When the “reorder” function just isn’t good enough…

[This article was first published on Data and Analysis with R, at Work, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The reorder function, in R 3.0.0, is behaving strangely (or I’m really not understanding something).  Take the following simple data frame:

df = data.frame(a1 = c(4,1,1,3,2,4,2), a2 = c(“h”,”j”,”j”,”e”,”c”,”h”,”c”))

I expect that if I call the reorder function on the a2 vector, using the a1 vector as the vector to order the second one by, then any summary stats that I run on the a2 vector will be ordered according to the numbers in a1.  However, look what happens:

table(reorder(df$a2, df$a1))
c e h j 
2 1 2 2

I found out that in order to get it in the order specified by the numbers in the first vector, the following code seems to work:

df$a2 = factor(df$a2, levels=unique(df$a2)[order(unique(df$a1))], ordered=TRUE)

Now look at the result:

table(df$a2)

j c e h 
2 2 1 2

One thing I notice here is that R seems to be keeping the factor levels alphabetically organized. When I specify the levels by using the “unique” function, it allows itself to break the alphabetic organization.

Why won’t the reorder function work in this case?


To leave a comment for the author, please follow the link and comment on their blog: Data and Analysis with R, at Work.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)