
Catching errors in R and trying something else

[This article was first published on Rcrastinate, and kindly contributed to R-bloggers.]
I recently encountered some functionality in R which most of you might already know. Nevertheless, I want to share it here, because it might come in handy for those of you who do not know this yet.

Suppose you want to read a large number of very large text tables into R. There is the great function fread() in the data.table package, which is really fast at reading such large tables. However, it is still under development and sometimes fails (e.g., if an entry contains unbalanced quotes).

I guess this will be fixed in the future. In the meantime, I wrote a little function which catches an error and tries something else.

The following function reads in a file (I stored it on some private webspace in case you want to try this out) with fread(). fread() will fail on this file, so the function next tries good old read.table() with the appropriate parameter (quote = "") set. read.table() is much slower, but it also works for unbalanced quotes.

The function try() does the trick…

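read.file <- function (file.name) {
  require(data.table)
  # Try the fast reader first; on failure, try() returns a "try-error" object
  file <- try(fread(file.name))
  # inherits() is safer than class(file) == ..., since class() can have length > 1
  if (inherits(file, "try-error")) {
    cat("Caught an error during fread, trying read.table.\n")
    # Fall back to the slower reader with quote handling switched off
    file <- as.data.table(read.table(file.name, sep = " ", quote = ""))
  }
  file
}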

# Let’s try this (console messages translated from German)

> read.file("http://www.wolferonline.de/test/test.txt")

trying URL ‘http://www.wolferonline.de/test/test.txt’
Content type ‘text/plain’ length 72 bytes
opened URL
==================================================
downloaded 72 bytes

Error in fread(file.name) : 
  Unbalanced " observed on this line: "Unbalanced.quotes some.entry some.other.entry

Caught an error during fread, trying read.table.
                  V1         V2               V3
1          No.quotes     entry1           entry2
2 "Unbalanced.quotes some.entry some.other.entry
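
If the test URL is not reachable, the read.table() fallback can still be checked against a local copy. The following is only a sketch: it assumes the file holds the two space-separated lines shown in the output above, and the temporary file exists purely for illustration. It shows what quote = "" does: quote handling is switched off, so the stray " is read as ordinary data instead of opening a quoted string that never closes.

```r
# Recreate the test file locally (contents assumed from the printed output)
tmp <- tempfile(fileext = ".txt")
writeLines(c('No.quotes entry1 entry2',
             '"Unbalanced.quotes some.entry some.other.entry'), tmp)

# quote = "" disables quote handling, so the stray " is kept as data
tab <- read.table(tmp, sep = " ", quote = "")
print(tab)
```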

The cool thing about this: whether you read in a file or do something else that has a fast and a slow way of doing it, you can try the fast way first. If it fails, you can still fall back to the other, more stable (but slower) way. Also, you can use try() as often as you like. So if the slower way fails as well, you can return something that your script can keep working with.
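
To make the "use try() as often as you like" idea concrete, here is a minimal sketch of a chained fallback. The helper read.with.fallbacks() and the two toy readers are hypothetical names made up for this illustration; they are not part of data.table or base R. Each reader is tried in turn, and NULL is returned if all of them fail, so the calling script has something to test for.

```r
# Hypothetical helper: try each reader in turn, return NULL if all fail
read.with.fallbacks <- function(file.name, readers) {
  for (reader in readers) {
    # silent = TRUE keeps try() from printing the error message
    result <- try(reader(file.name), silent = TRUE)
    if (!inherits(result, "try-error")) {
      return(result)
    }
  }
  NULL  # nothing worked; the calling script can check for this
}

# Small local file so the example does not depend on any URL
tmp <- tempfile(fileext = ".txt")
writeLines(c("a b c", "1 2 3"), tmp)

# Stand-ins for a fast-but-fragile and a slow-but-stable method
failing.reader <- function(f) stop("pretend the fast reader failed")
slow.reader    <- function(f) read.table(f, header = TRUE, sep = " ")

tab     <- read.with.fallbacks(tmp, list(failing.reader, slow.reader))
nothing <- read.with.fallbacks(tmp, list(failing.reader))
```

In a longer script, a check such as is.null(nothing) can then decide whether to skip the file or stop.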

Good luck!
