R and data

[This article was first published on Erehweb's Blog » r, and kindly contributed to R-bloggers.]

My fellow bloggers John and Scott have posted recently about the free statistical programming language R.  How does it compare to an expensive language like SAS?

If you’ve done any statistical analysis, then you’ll know that getting and cleaning the data is a major step in any project.  SAS does a pretty good job at this, and will complain if the data is not in the format you think it is.  As for R, here’s an excerpt from the R FAQ:

7.10 How do I convert factors to numeric?

It may happen that when reading numeric data into R (usually, when reading in a file), they come in as factors. If f is such a factor object, you can use

as.numeric(as.character(f))

to get the numbers back. More efficient, but harder to remember, is

as.numeric(levels(f))[as.integer(f)]

In any case, do not call as.numeric() or their likes directly for the task at hand (as as.numeric() or unclass() give the internal codes).
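
To make the FAQ's warning concrete, here is a minimal sketch. The factor f and its values are made up for illustration and do not come from any real data set. It compares what as.numeric(f) hands back with the two idioms the FAQ recommends.

```r
# Made-up example: numbers that have ended up stored as a factor.
f <- factor(c("10.5", "3", "10.5", "200"))

# Calling as.numeric() directly returns the internal level codes, not the values.
codes <- as.numeric(f)

# The two idioms from the FAQ go through the level labels instead.
values <- as.numeric(as.character(f))
fast   <- as.numeric(levels(f))[as.integer(f)]

print(codes)                    # small integers indexing into levels(f)
print(values)                   # the numbers you actually wanted
print(identical(values, fast))  # both idioms agree on a genuine factor
```

The codes are just positions in levels(f), so they bear no relation to the original magnitudes.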

As one of my favorite musicals says, “It ain’t no joke, that’s why it’s funny”.  Maybe when you do an uncommon operation like reading in a file, your numbers will be silently converted into factors / categorical variables.  Or maybe not.  Ha ha.  But certainly, don’t do anything silly like thinking as.numeric(f) would convert f into numbers you might want.  Ha ha ha.  Oh, and that “more efficient” way of doing things?  It breaks if f was actually numeric to start with: levels(f) is NULL, so instead of your numbers you typically get a vector of NAs, with no error to warn you.  Ha ha ha ha.  Stop, you’re killing me!  [or at least, my productivity].
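
For what it's worth, here is a rough sketch of how this plays out when reading a file. The file contents and the helper to_number() are hypothetical, invented for this example. It assumes the usual cause: a stray non-numeric entry stops a column from parsing as numeric, and with stringsAsFactors = TRUE (the default before R 4.0.0) the column then arrives as a factor. As far as I can tell, applying the "more efficient" idiom to a plain numeric vector usually fails quietly rather than loudly.

```r
# Made-up file contents: one stray "n/a" is enough to stop the column parsing as numeric.
csv_text <- "id,weight\n1,10.5\n2,n/a\n3,200\n"

# stringsAsFactors = TRUE was the default before R 4.0.0; it is set explicitly here.
raw <- read.csv(text = csv_text, stringsAsFactors = TRUE)
weight_is_factor <- is.factor(raw$weight)

# Declaring column types and the missing-value marker up front keeps the numbers numeric.
clean <- read.csv(text = csv_text,
                  colClasses = c("integer", "numeric"),
                  na.strings = "n/a")

# The "more efficient" idiom applied to a plain numeric vector:
# levels(x) is NULL, so the lookup table is empty and every index misses.
x <- c(10.5, 3, 200)
misfire <- as.numeric(levels(x))[as.integer(x)]

# Hypothetical helper (not from the FAQ): only use the factor idiom on factors.
to_number <- function(v) {
  if (is.factor(v)) as.numeric(levels(v))[as.integer(v)] else as.numeric(v)
}

print(weight_is_factor)
print(clean$weight)
print(misfire)
print(to_number(x))
```

Checking is.factor() before converting is one way to avoid the silent NAs. Declaring colClasses when reading avoids the factor in the first place.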

To complete the joke, here’s an excerpt from the R manual:

In general, coercion from numeric to character and back again will not be exactly reversible, because of roundoff errors in the character representation.

That’s fair enough.  It’s not as if you have a good reason for doing this, except perhaps when you’re reading numbers in from a file.
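
A small sketch of the round-trip the manual is describing, using 1/3 as an arbitrary illustrative value. It assumes as.character() keeps about 15 significant digits for doubles, which is my understanding of its behaviour.

```r
# An arbitrary value with no short decimal representation.
x <- 1 / 3

# Numeric -> character -> numeric, as in the FAQ's conversion idiom.
s <- as.character(x)
y <- as.numeric(s)

print(s)
print(identical(x, y))         # exact comparison
print(isTRUE(all.equal(x, y))) # comparison within a small tolerance
print(x - y)                   # size of the round-off
```

The difference is tiny, so all.equal() is a safer check than identical() or == after a trip through character.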

