Drop unused factor levels

Posted on February 4, 2008 by Forester

[This article was first published on Quantitative Ecology, and kindly contributed to R-bloggers].

When creating a subset of a data frame, I often exclude rows based on the level of a factor. However, the levels of the factor remain intact, including those that no longer occur in the subset. This is the intended behavior of R, but it can cause problems in some cases (for example, empty groups showing up in tables and plots). I finally discovered how to clean up levels in this post to R-Help. Here is an example:

> a <- factor(letters)
> a
 [1] a b c d e f g h i j k l m n o p q r s t u v w x y z
Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z

## Now, even though b only includes five letters, 
## all 26 are listed in the levels
> b <- a[1:5]
> b
[1] a b c d e
Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z

## This behavior can be changed by passing drop = TRUE when subsetting:
> b <- a[1:5, drop = TRUE]
> b
[1] a b c d e
Levels: a b c d e
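
To show why the lingering levels can matter, here is a minimal sketch using a made-up data frame (the names `d`, `site`, `count`, and `kept` are hypothetical and not from the original post). After the rows for one site are excluded, `table()` still reports that site with a count of zero until the unused level is dropped:

```r
# Hypothetical data: three sites, two observations each
d <- data.frame(site  = factor(c("A", "A", "B", "B", "C", "C")),
                count = c(3, 5, 2, 8, 1, 4))

# Exclude site "C"; the level "C" stays attached to the factor
kept <- d[d$site != "C", ]
levels_before <- levels(kept$site)
counts_before <- table(kept$site)   # "C" is listed with a count of zero

# Drop the unused level using the same drop = TRUE syntax as above
kept$site <- kept$site[, drop = TRUE]
levels_after <- levels(kept$site)
counts_after <- table(kept$site)    # only the levels still present remain
```

The same zero-count groups tend to show up in plots and model summaries, which is usually where the problem is first noticed.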

Another way to deal with this is to use the dropUnusedLevels() function in the Hmisc package. The only issue here is that it changes the behavior globally: factor subsetting is redefined so that unused levels are dropped by default, which may have undesired consequences (see the post listed above).
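
For readers on newer versions of R (2.12.0 and later), base R also provides droplevels(), which removes unused levels from a single factor or from every factor column of a data frame without altering any global behavior. This function postdates the original post, so the following is only a supplementary sketch; the data frame `d` and its columns `id` and `x` are hypothetical:

```r
# Drop unused levels from a single factor
a <- factor(letters)
b <- droplevels(a[1:5])
levels(b)

# Drop unused levels from all factor columns of a data frame
# (hypothetical columns `id` and `x`)
d  <- data.frame(id = factor(letters), x = 1:26)
d5 <- droplevels(d[1:5, ])
nlevels(d5$id)
```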

****UPDATE****
As Jeff Hollister mentions in the comments, there is another way to do this:

a <- factor(letters)
b <- factor(a[1:5])

Yet another way, if you are working with data frames that by default convert characters into factors, was suggested on r-sig-ecology by Hadley Wickham:

options(stringsAsFactors = FALSE)
a <- data.frame("alpha" = letters)
## Select the first five rows (a[1:5] would ask for columns 1 to 5 and fail)
b <- a[1:5, , drop = FALSE]
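
As a rough check of what this option changes, the sketch below builds the same data frame both ways. It passes stringsAsFactors to data.frame() directly rather than setting the global option, which is an assumption made here to keep the example self-contained; the object names are hypothetical. Note also that from R 4.0.0 onward, FALSE is already the default.

```r
# Same data, once as character and once as factor
a_chr <- data.frame(alpha = letters, stringsAsFactors = FALSE)
a_fac <- data.frame(alpha = letters, stringsAsFactors = TRUE)

# Keep the first five rows; drop = FALSE preserves the data frame structure
b_chr <- a_chr[1:5, , drop = FALSE]
b_fac <- a_fac[1:5, , drop = FALSE]

class(b_chr$alpha)     # a character column has no levels to clean up
nlevels(b_fac$alpha)   # the factor column still carries every original level
```

With the character version there is nothing to drop after subsetting; the column can be converted to a factor later, at which point only the values present become levels.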
