The most common R error messages

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

R has something of a reputation for generating, shall we say, obscure error messages like this:

Error in model.frame.default(formula = y ~ female + DNC + SE_region +  : could not find function "function (object, ...) nobject"

One tip for dealing with error messages is to ignore everything between “Error in” and the colon: unless you are running a function that you wrote yourself, only the error message at the end is likely to be useful. If you're still stuck, another tip is to ask for help on StackOverflow.com using the [r] tag, where you'll find more than 20,000 questions about R error messages.
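To make the first tip concrete, here is a minimal sketch in base R that captures an error and separates the call (the part printed after “Error in”) from the message itself. The function name undefined_function_xyz is hypothetical and deliberately left undefined so that the call fails.

```r
# Capture the error object instead of letting it stop the script.
# undefined_function_xyz is a made-up name that does not exist.
err <- tryCatch(
  undefined_function_xyz(1),
  error = function(e) e
)

# The message is the part after the colon: usually the useful bit.
print(conditionMessage(err))

# The call is what R prints between "Error in" and the colon.
print(conditionCall(err))
```

When searching for help, pasting only the conditionMessage() text tends to work better than pasting the whole line, since the call is specific to your own code.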

Noam Ross has analyzed these questions to find the most commonly asked-about R error messages. Naturally, he used the stackr R package to interrogate the StackOverflow API, and downloaded around 10,000 error messages. He then used a regular expression to break the questions down into trigrams (sequences of 3 words) to be able to count which were the most common. On that basis, the most common types of error messages were:
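The trigram idea can be sketched in a few lines of base R. This is not Noam's code: the sample strings are invented for illustration, the trigrams() helper is hypothetical, and the word-splitting regular expression is just one reasonable choice.

```r
# Made-up sample text standing in for downloaded question content.
questions <- c(
  "Error in if (x > 0) : missing value where TRUE/FALSE needed",
  "could not find function ggplot",
  "Error: could not find function melt",
  "could not find function even after loading the package"
)

# Split a string into lower-case words, then join each run of three words.
trigrams <- function(text) {
  words <- unlist(strsplit(tolower(text), "[^a-z/]+"))
  words <- words[nzchar(words)]          # drop empty pieces from the split
  n <- length(words)
  if (n < 3) return(character(0))
  paste(words[1:(n - 2)], words[2:(n - 1)], words[3:n])
}

# Count every trigram across all questions, most frequent first.
counts <- sort(table(unlist(lapply(questions, trigrams))), decreasing = TRUE)
print(head(counts, 3))
```

On this toy input, “could not find” and “not find function” each appear three times, which is the kind of signal the full analysis picks up at scale.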

  1. “could not find function” errors, usually caused by typos or not loading a required package
  2. “Error in if” errors, caused by non-logical data or missing values passed to R's “if” conditional statement
  3. “Error in eval” errors, caused by references to objects that don't exist
  4. “cannot open” errors, caused by attempts to read a file that doesn't exist or can't be accessed
  5. “no applicable method” errors, caused by using an object-oriented function on a data type it doesn't support
  6. “subscript out of bounds” errors, caused by trying to access an element or dimension that doesn't exist
  7. package errors, caused by being unable to install, compile or load a package
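If you'd like to see these for yourself, the sketch below deliberately triggers one error of each type and collects the messages. Every name ending in _xyz or Xyz is hypothetical and chosen so that it does not exist; the msg() helper is ours, not part of R; and the exact wording of the messages may vary with your R version and locale.

```r
# Evaluate an expression and return the error message text, if any.
msg <- function(expr) {
  tryCatch({ expr; "no error" }, error = function(e) conditionMessage(e))
}

# A hypothetical S3 generic with no methods, to provoke a dispatch failure.
describe_xyz <- function(x) UseMethod("describe_xyz")

msgs <- c(
  # 1. calling a function that is not defined or whose package is not loaded
  not_found_fn  = msg(no_such_function_xyz()),
  # 2. a missing value used as the condition of if
  if_na         = msg(if (NA) "yes"),
  # 3. evaluating a reference to an object that does not exist
  eval_object   = msg(eval(quote(no_such_object_xyz))),
  # 4. opening a file that does not exist
  cannot_open   = msg(suppressWarnings(file("no_such_dir_xyz/data.csv", "r"))),
  # 5. a generic applied to a type it has no method for
  no_method     = msg(describe_xyz(1)),
  # 6. asking for row 3 of a two-row matrix
  out_of_bounds = msg(matrix(1:4, nrow = 2)[3, 1]),
  # 7. loading a package that is not installed
  no_package    = msg(library("noSuchPackageXyz", character.only = TRUE))
)

print(msgs)
```

Wrapping each call in tryCatch() keeps the script running after every failure, so all seven messages can be compared side by side.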

Noam's full analysis is at the link below. In addition to providing insights about R's error messages, the trigram method he uses will be useful to anyone who needs to do frequency analysis on unstructured data.

Noam Ross (github): Common errors in R: An Empirical Investigation
