When the “reorder” function just isn’t good enough…

May 6, 2013
(This article was first published on Data and Analysis with R, at Work, and kindly contributed to R-bloggers)

The reorder function, in R 3.0.0, is behaving strangely (or I’m really not understanding something).  Take the following simple data frame:

df = data.frame(a1 = c(4,1,1,3,2,4,2), a2 = c("h","j","j","e","c","h","c"))

I expect that if I call reorder on the a2 vector, using a1 as the vector to order it by, then any summary stats that I run on a2 will come out ordered according to the numbers in a1.  However, look what happens:

```
table(reorder(df$a2, df$a1))
c e h j
2 1 2 2
```
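For comparison, here is a minimal sketch of what the documentation for the stats version of reorder describes: the levels are sorted by a per-level summary of the second vector, the mean by default. The explicit `stats::` prefix and the names `level_means` and `reordered` are additions for illustration only. If a session prints alphabetical levels instead, one possible explanation (an assumption, not something confirmed here) is that another attached package masks reorder; `find("reorder")` lists where it is defined.

```r
# Same data as above, with a2 built explicitly as a factor
a1 <- c(4, 1, 1, 3, 2, 4, 2)
a2 <- factor(c("h", "j", "j", "e", "c", "h", "c"))

# Per-level mean of a1: the default summary that stats::reorder sorts by
level_means <- tapply(a1, a2, mean)
print(level_means)

# Call the stats version explicitly so a masking package cannot interfere
reordered <- stats::reorder(a2, a1)
print(levels(reordered))   # "j" "c" "e" "h" (ascending per-level mean)

# List every attached package or environment that defines `reorder`
print(find("reorder"))
```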

I found out that in order to get it in the order specified by the numbers in the first vector, the following code seems to work:

df$a2 = factor(df$a2, levels=unique(df$a2)[order(unique(df$a1))], ordered=TRUE)

Now look at the result:

```
table(df$a2)

j c e h
2 2 1 2
```
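The factor() call above pairs unique(df$a2) with unique(df$a1), which lines up here because every level of a2 always carries the same a1 value. Below is a hedged sketch of a variant that does not depend on that pairing: it computes one summary per level and uses the sorted names as the levels. The choice of the mean as the summary and the name `level_means` are assumptions made for illustration.

```r
# Rebuild the example data from scratch
df <- data.frame(a1 = c(4, 1, 1, 3, 2, 4, 2),
                 a2 = c("h", "j", "j", "e", "c", "h", "c"))

# One summary value per level of a2 (here: the mean of a1)
level_means <- tapply(df$a1, df$a2, mean)

# Use the level names, sorted by their summary value, as the factor levels
df$a2 <- factor(df$a2, levels = names(sort(level_means)), ordered = TRUE)

print(table(df$a2))
```

With this data the levels come out as j, c, e, h, matching the result above; with data where a level has several different a1 values, the order would follow the per-level means instead.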

One thing I notice here is that R seems to keep the factor levels in alphabetical order by default. When I specify the levels explicitly, by using the “unique” function, it lets go of the alphabetical ordering.
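A small sketch of that default, using only base R: factor() sorts the unique values when no levels are given, and keeps whatever order is passed in through the levels argument. The variable names are illustrative.

```r
a2 <- c("h", "j", "j", "e", "c", "h", "c")

# No levels given: the unique values are sorted
default_levels <- levels(factor(a2))
print(default_levels)    # "c" "e" "h" "j"

# Levels given explicitly: order of first appearance is kept
explicit_levels <- levels(factor(a2, levels = unique(a2)))
print(explicit_levels)   # "h" "j" "e" "c"
```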

Why won’t the reorder function work in this case?
