Catching errors in R and trying something else

December 4, 2013
(This article was first published on Rcrastinate, and kindly contributed to R-bloggers)

I recently came across some functionality in R that many of you probably already know. I still want to share it here, because it might come in handy for those of you who don't.

Suppose you want to read a large number of very large text tables into R. The data.table package has the great function fread(), which reads such tables really fast. However, it is still under development and sometimes fails (e.g., if an entry contains unbalanced quotes).

I guess this will be fixed in the future. In the meantime, I wrote a little function that catches the error and tries something else.

The following function reads in a file (I stored one on some private webspace for you if you want to try this out) with fread(). fread() will fail on this file, so the function next tries good old read.table() with the appropriate parameters set. read.table() is much slower, but it also works with unbalanced quotes.

The function try() does the trick...

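read.file <- function (file.name) {
  library(data.table)
  # Try the fast reader first; try() returns a "try-error" object on failure.
  file <- try(fread(file.name))
  # inherits() is safe here: class() of a data.table has length 2.
  if (inherits(file, "try-error")) {
    cat("Caught an error during fread, trying read.table.\n")
    # Fall back to the slower read.table(), with quoting disabled.
    file <- as.data.table(read.table(file.name, sep = " ", quote = ""))
  }
  file
}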

# Let's try this

> read.file("http://www.wolferonline.de/test/test.txt")

trying URL 'http://www.wolferonline.de/test/test.txt'
Content type 'text/plain' length 72 bytes
opened URL
==================================================
downloaded 72 bytes

Error in fread(file.name) : 
  Unbalanced " observed on this line: "Unbalanced.quotes some.entry some.other.entry

Caught an error during fread, trying read.table.
                  V1         V2               V3
1          No.quotes     entry1           entry2
2 "Unbalanced.quotes some.entry some.other.entry

The cool thing about this: Whether you are reading in a file or doing something else that has a fast and a slow way to do it, you can try the fast way first. If it fails, you can still fall back on the other, more stable (but slower) way. Also, you can use try() as often as you like. So if the slower way fails too, you can return something that your script can still work with further on.
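The idea of chaining several attempts could be sketched roughly like this. Note that try_in_order(), fast_way() and slow_way() are names I made up for illustration (they are not part of base R or data.table), and the two toy functions only stand in for a real fast and slow method. The sketch uses base R only.

```r
# Hypothetical helper: call each function in `attempts` with the same
# arguments until one succeeds; return `default` if all of them fail.
try_in_order <- function(attempts, ..., default = NULL) {
  for (attempt in attempts) {
    # silent = TRUE suppresses printing of the error message
    result <- try(attempt(...), silent = TRUE)
    if (!inherits(result, "try-error")) {
      return(result)
    }
  }
  default
}

# Toy stand-ins (assumptions): a fast way that always fails
# and a slow way that always works.
fast_way <- function(x) stop("fast way failed")
slow_way <- function(x) x * 2

try_in_order(list(fast_way, slow_way), 21)                # falls back to slow_way
try_in_order(list(fast_way, fast_way), 21, default = NA)  # everything fails: NA
```

With a default such as NA or NULL, the rest of a script can check the result and carry on instead of stopping at the first error.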

Good luck!
