Survive R

[This article was first published on Win-Vector Blog » R, and kindly contributed to R-bloggers.]

New PDF slides version (presented at the Bay Area R Users Meetup October 13, 2009).

We at Win-Vector LLC appear to like R a bit more than some of our, perhaps wiser, colleagues (see: Choose your weapon: Matlab, R or something else? and R and data). While we do like R (see: Exciting Technique #1: The “R” language), we also understand the need to defend oneself against the abuse regularly dished out by R. Here we will quickly share a few fighting techniques.

If you are not already using R, the following will not mean much. If you are using R, this may scratch a few itches.

  • First: Write down everything; keep notes in a separate file.

    When you do figure out how to do something in R, it will be concise, powerful, completely un-mnemonic, and impossible to find again through the help system.

  • Second: Find some way to search for R answers.

    http://stackoverflow.com/questions/102056/how-to-search-for-r-materials

  • Third: Learn unclass().

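    # Here is an example of fitting a generalized linear model (from the help(glm) documentation)
    ## Dobson (1990) Page 93: Randomized Controlled Trial :
    > counts <- c(18,17,15,20,10,20,25,13,12)
    > outcome <- gl(3,1,9)
    > treatment <- gl(3,3)
    > glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())

    > model <- unclass(glm.D93)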

    The model is now a harmless list without a bunch of pesky methods hiding the information.
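
    As a minimal sketch of what becomes reachable once the class is stripped, the snippet below rebuilds the help(glm) example and pokes at the resulting list. The variable name model_list is an illustrative choice, not something from the original post.

    ```r
    # Dobson (1990) example data, as in help(glm)
    counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
    outcome <- gl(3, 1, 9)
    treatment <- gl(3, 3)
    glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())

    class(glm.D93)                  # "glm" "lm"
    model_list <- unclass(glm.D93)  # model_list is a hypothetical name
    class(model_list)               # "list"
    head(names(model_list))         # the components are now plain list names
    model_list$coefficients         # direct access, no print method in the way
    ```

    Printing model_list dumps every component, which is noisy but hides nothing; str(model_list, max.level = 1) gives a more compact overview.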

  • Fourth: Learn how to list classes and methods.

    Often one of methods(), showMethods() or getS3method() can show you what methods are on a class or object. Be prepared to try them all, as they apply in different contexts.

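    # let's make a tricky function (the method bodies here are illustrative placeholders)
    > fe <- function(x, ...) UseMethod('fe')
    > fe.formula <- function(x, ...) cat('formula\n')
    > fe.numeric <- function(x, ...) cat('numeric\n')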

    How will anyone figure out what we have done?

    > class(fe)
    [1] "function"

    > methods(fe)
    # [1] fe.formula fe.numeric

    > getS3method('fe','numeric')
    # fe.numeric
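
    The "different contexts" are mostly S3 versus S4. The sketch below puts the lookup functions side by side; the generics fe and describe and their method bodies are made up for illustration.

    ```r
    library(methods)

    # A toy S3 generic with two methods (names and bodies are hypothetical).
    fe <- function(x, ...) UseMethod("fe")
    fe.formula <- function(x, ...) "formula method"
    fe.numeric <- function(x, ...) "numeric method"

    methods(fe)                        # S3 methods defined for the generic fe
    methods(class = "numeric")         # S3 methods that dispatch on class "numeric"
    f <- getS3method("fe", "numeric")  # fetch one method as an ordinary function
    f(1)

    # S4 generics are inspected with showMethods() instead.
    setGeneric("describe", function(x) standardGeneric("describe"))
    setMethod("describe", "numeric", function(x) "S4 numeric method")
    showMethods("describe")
    existsMethod("describe", "numeric")
    ```

    If methods() comes back empty for something that clearly dispatches, that is usually the cue to try showMethods() on it.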

  • Fifth: Learn to stomp out attributes.

    Ever have this crud follow you around?

    > m <- summary(c(1,2))[4]
    > m
    Mean
    1.5

    Ah that’s cute: a little “Mean” tag is following the data around. But what if we try to use this value:

    > m*m
    Mean
    2.25

    Okay, now the “Mean” tag has outstayed its welcome. The fix:

    > attributes(m) <- NULL
    > m*m
    [1] 2.25

    MUCH better.
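
    Clearing attributes() is the blunt instrument. When only the name tag is the problem, a few gentler options exist; this is a sketch on a hand-built named value (m is constructed directly here purely for illustration).

    ```r
    m <- c(Mean = 1.5)        # a named value, built by hand for illustration
    names(m * m)              # the name survives arithmetic

    unname(m) * unname(m)     # drop only the names
    as.numeric(m)^2           # as.numeric() also returns a bare value
    m[["Mean"]]               # [[ ]] extracts the element without its name

    x <- m
    attributes(x) <- NULL     # drop every attribute at once
    x * x
    ```

    unname() is the safer default on anything with a dim attribute, since wiping all attributes would also flatten a matrix into a plain vector.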

  • Sixth: Swallow your pride.

    My example: does R have map structures? I have no idea and I am too ashamed to ask. However, I know I can fake it with environments (which may be “the R way to do this” or may be “a horrible abuse of the language”; I have no idea which).

    > map <- new.env()
    > assign('dog',7,map)
    > ls(map)
    [1] "dog"
    > get('dog',envir=map)
    [1] 7

    That (nearly) gives you maps with string keys. For maps with numeric keys we can fake something else up with findInterval(). For maps keyed by generic comparable objects, I have no idea how you would trick R into helping. This is one reason we like to separate out all data preparation into a pre-processing step implemented in Java or SQL.

    Note important correction from Eward Ratzer: use map <- new.env(hash=TRUE,parent=emptyenv()); see comments.
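
    A slightly fuller sketch of both tricks, using the corrected new.env() call for string keys and findInterval() for numeric ones. The keys, bucket boundaries and labels are invented for illustration.

    ```r
    # String-keyed map backed by an environment.
    map <- new.env(hash = TRUE, parent = emptyenv())
    assign("dog", 7, envir = map)
    map[["cat"]] <- 9                              # [[<- also works on environments

    exists("dog", envir = map, inherits = FALSE)   # TRUE
    exists("cow", envir = map, inherits = FALSE)   # FALSE
    get("dog", envir = map)
    sort(ls(map))
    mget(c("cat", "dog"), envir = map)             # several lookups at once
    rm("dog", envir = map)                         # delete a key

    # Numeric keys: findInterval() returns the index of the bucket a value falls in.
    breaks <- c(0, 10, 100)                        # hypothetical bucket boundaries
    labels <- c("small", "medium", "large")
    labels[findInterval(c(3, 10, 250), breaks)]
    ```

    The parent = emptyenv() argument matters because get() and exists() search enclosing environments by default, so a map whose parent is the global environment can appear to contain any variable that happens to be defined there.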

  • Seventh: Find and rely on “the one-liners.”

    Reading in an entire comma-separated file (read.table()), re-aggregating data (table() or doBy’s summaryBy() command), or building an empirical cumulative distribution function (ecdf()), each in a single line of code, is an experience not to be missed.
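
    A small self-contained run through these one-liners might look like the following. The file contents and column names are made up, and base R's aggregate() stands in for doBy's summaryBy() so that no extra package is needed.

    ```r
    # Write a tiny CSV to a temporary file so the example is self-contained.
    path <- tempfile(fileext = ".csv")
    writeLines(c("animal,weight", "dog,7", "cat,4", "dog,9"), path)

    d <- read.table(path, header = TRUE, sep = ",")             # whole file in one line
    counts <- table(d$animal)                                   # tabulate a column
    means <- aggregate(weight ~ animal, data = d, FUN = mean)   # per-group means
    Fn <- ecdf(d$weight)                                        # empirical CDF, returned as a function
    Fn(7)                                                       # fraction of weights <= 7
    ```

    Note that ecdf() returns a function rather than a table of values, so it can be evaluated at any point or handed straight to plot().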

The overall point is that while R has some (unnecessarily) sharp edges and pain points, it is a powerful tool worth using. I would much rather struggle through a minor R-language issue when trying to prepare my data than do without the many special functions, distributions, fitters and plotters built into the R system.

Related posts:

  1. R examine objects tutorial
  2. Exciting Technique #1: The “R” language.
  3. R annoyances
