R has something of a reputation for generating, shall we say, obscure error messages like this:

Error in model.frame.default(formula = y ~ female + DNC + SE_region +  : could not find function "function (object, ...) nobject"

One tip for dealing with error messages is to ignore everything between “Error in” and the colon: unless you are running a function that you wrote yourself, only the error message at the end is likely to be useful. If you're still stuck, another tip is to ask for help on StackOverflow.com using the [r] tag, where you'll find more than 20,000 questions about R error messages.
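
To make the first tip concrete, here is a minimal sketch that catches an error and looks at its two parts separately. The function name mean_of_colum is a deliberately misspelled, hypothetical name chosen only to trigger an error; the exact message wording assumes an English locale.

```r
# Catch the error object instead of letting it stop the script.
# mean_of_colum() is a hypothetical, intentionally nonexistent function.
err <- tryCatch(
  mean_of_colum(mtcars),
  error = function(e) e
)

# The call: this is what R prints between "Error in" and the colon.
print(conditionCall(err))

# The message: this is the part after the colon, and usually the useful bit.
print(conditionMessage(err))
```

When searching for help, the text returned by conditionMessage() is generally the better search term, since the call varies from script to script.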

Noam Ross has analyzed these questions to find the most commonly asked-about R error messages. Naturally, he used the stackr R package to interrogate the StackOverflow API, and downloaded around 10,000 error messages. He then used a regular expression to break the questions down into trigrams (sequences of 3 words) to be able to count which were the most common. On that basis, the most common types of error messages were:
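
As a rough illustration of trigram counting, here is a minimal base R sketch. It is not Noam's code: the sample messages are made up for the example, and the tokenizing pattern and the trigrams() helper are assumptions chosen to keep things short.

```r
# Hypothetical sample of error messages (not data from the analysis).
messages <- c(
  "Error in if (x > 0) print(x) : missing value where TRUE/FALSE needed",
  "Error in eval(expr, envir, enclos) : object 'x' not found",
  "Error in if (y) z : argument is of length zero",
  "could not find function \"ggplot\""
)

# Split one string into lowercase words, then paste each run of three
# consecutive words together. Strings with fewer than three words
# yield no trigrams.
trigrams <- function(text) {
  text <- tolower(text)
  words <- regmatches(text, gregexpr("[a-z']+", text))[[1]]
  n <- length(words)
  if (n < 3) return(character(0))
  paste(words[1:(n - 2)], words[2:(n - 1)], words[3:n])
}

# Count every trigram across all messages, most frequent first.
counts <- sort(table(unlist(lapply(messages, trigrams))), decreasing = TRUE)
print(head(counts, 5))
```

In this toy sample the trigram "error in if" comes out on top because it is the only one that appears in two messages; on real data the same tally is what surfaces the common error types.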

1. “could not find function” errors, usually caused by typos or not loading a required package
2. “Error in if” errors, caused by non-logical data or missing values passed to R's “if” conditional statement
3. “Error in eval” errors, caused by references to objects that don't exist
4. “cannot open” errors, caused by attempts to read a file that doesn't exist or can't be accessed
5. “no applicable method” errors, caused by using an object-oriented function on a data type it doesn't support
6. “subscript out of bounds” errors, caused by trying to access an element or dimension that doesn't exist
7. package errors, caused by being unable to install, compile or load a package
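
Several of the errors in the list above are easy to reproduce on purpose, which can help in recognizing them later. The sketch below is only illustrative: error_message() and area() are hypothetical helpers defined here, the object and function names are made up so that they do not exist, and the message wording assumes an English locale.

```r
# Run an expression and return only its error message (NA if it succeeds).
error_message <- function(expr) {
  tryCatch({ expr; NA_character_ }, error = function(e) conditionMessage(e))
}

# "could not find function": a typo, or a package that was never loaded.
msg_function <- error_message(no_such_function(1))

# "Error in if": a missing value reaches the condition.
msg_if <- error_message(if (NA) "yes")

# Reference to an object that doesn't exist.
msg_object <- error_message(no_such_object + 1)

# "cannot open": reading a file that isn't there.
# tempfile() returns a fresh path without creating the file.
missing_path <- tempfile(fileext = ".csv")
msg_file <- error_message(suppressWarnings(read.csv(missing_path)))

# "no applicable method": a generic with no method for this class.
area <- function(shape) UseMethod("area")  # hypothetical generic
msg_method <- error_message(area(1))

# "subscript out of bounds": asking for a row that doesn't exist.
msg_subscript <- error_message(matrix(1:4, nrow = 2)[3, 1])

print(c(msg_function, msg_if, msg_object, msg_file, msg_method, msg_subscript))
```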

Noam's full analysis is at the link below. In addition to providing insights about R's error messages, the trigram method he uses will be useful to anyone who needs to do frequency analysis on unstructured data.

Noam Ross (GitHub): Common errors in R: An Empirical Investigation