Tricks I learned today #1: as.integer() on factor levels

October 12, 2011

(This article was first published on chem-bla-ics, and kindly contributed to R-bloggers)

I normally work with full numerical data, not categorical data. R, when using read.csv() seems to recognize such categories and marks the column as to have factor levels. This is useful indeed. However, I wanted to make a PCA biplot on this data, so was looking for ways to convert this to class numbers. After some googling we, Anna and me, ran into as.integer() which can be used on the factor levels. So, today I learned this trick:

> a = as.factor(c(“A”, “B”, “A”, “C”))
> b = as.integer(factor(a))

Well, probably basic to many, it was new to me 🙂

Now, wondering if it is equally easy to convert it into a multi-column matrix where each column indicates class membership (thus, resulting in three columns for the above…). That’s another trick I need to learn…

To leave a comment for the author, please follow the link and comment on their blog: chem-bla-ics. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)