# YAPOEH! (Yet another post on error handling)

**R | Adi Sarid**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Error catching can be hard to catch at times (no pun intended). If you’re not used to error handling, this short post might help you do it elegantly.

There are many posts about error handling in R (and in fact the examples in the `purrr`

package documentation are not bad either). In this sense, this post is not original.

However, I do demonstrate two approaches: both the base-R approach (`tryCatch`

) and the `purrr`

approach (`safely`

and `possibly`

). The post contains a concise summary of the two methods, with a very simple example.

In this post I’m assuming you’re familiar with the basic concepts of functional programming.

## A simple example for a function with errors

Let’s analyze a very simple function which divides `number`

by 2. The function should return an error if its input is not an actual number, otherwise it will return `number/2`

. This function looks like this:

divide2 <- function(number){ # make sure the input is numeric (otherwise yield an error) if (!is.numeric(number)) { stop(paste(number, "is not a number!")) } # everything is fine, return the result number/2 } divide2(10) ## [1] 5 divide2(pi) ## [1] 1.570796

But trying a string will yield an error:

divide2("foobar") Error in divide2("foobar") : foobar is not a number!

## Where is the problem?

What happens if we have a dataframe (or a list, or any other object for that matter) and we want to try to divide numbers within that object? Invalid values might crash our attempt completely.

For example, this loops through two values (10 and \(\pi\)) and yields their division by 2.

library(purrr) map(list(10, pi), divide2) ## [[1]] ## [1] 5 ## ## [[2]] ## [1] 1.570796

But the next version fails completely, because it tries to loop through “foobar” which cannot be divided.

map(list("foobar", 10, pi), divide2) Error in .f(.x[[i]], ...) : foobar is not a number!

No matter where we place the invalid value “foobar”, it will fail our code completely, and we get nothing.

## The solution: a `safely`

/`possibly`

wrapper

Fortunately, there are very simple wrappers which can help us handle the errors elegantly. These are the two functions from the `purrr`

package: `safely`

and `possibly`

.

We’ll first demonstrate the simpler version, `possibly`

. It allows us to replace errors with a chosen value. Since we expect a number, let’s replace errors with `NA_real_`

(which is like saying an unknown value which is a real number).

possibly_divide2 <- possibly(divide2, otherwise = NA_real_, quiet = TRUE)

The `quiet = TRUE`

argument is just so errors will not be printed during the loop. Now we are ready to try our error safe variation.

possibly_out <- map(list("foobar", 10, pi), possibly_divide2) possibly_out ## [[1]] ## [1] NA ## ## [[2]] ## [1] 5 ## ## [[3]] ## [1] 1.570796

We can also use `unlist`

to turn the output into a simple vector, i.e. `unlist(possibly_out)`

will yield the vector *NA, 5, 1.5707963*.

The `safely`

variation has some more strength into it, since it also provides the error messages. The output is slightly more complex though.

safely_divide2 <- safely(.f = divide2, otherwise = NA_real_, quiet = T) safely_out <- map(list("foobar", 10, pi), safely_divide2) safely_out ## [[1]] ## [[1]]$result ## [1] NA ## ## [[1]]$error ## <simpleError in .f(...): foobar is not a number!> ## ## ## [[2]] ## [[2]]$result ## [1] 5 ## ## [[2]]$error ## NULL ## ## ## [[3]] ## [[3]]$result ## [1] 1.570796 ## ## [[3]]$error ## NULL

Following the approach suggested in the documentation of `safely`

(in the examples section), we can use `transpose()`

and `simplify_all()`

to arrange the output.

simply_safe <- safely_out %>% transpose() %>% simplify_all() simply_safe$result # for the output values ## [1] NA 5.000000 1.570796 simply_safe$error # for the error messages ## [[1]] ## <simpleError in .f(...): foobar is not a number!> ## ## [[2]] ## NULL ## ## [[3]] ## NULL

## For comparison’s sake, how would you `tryCatch`

it?

The function `tryCatch`

is a base-R approach for error handling. The concept is similar but the syntax is different. Here is an example of how to build a safe `divide2`

function with `tryCatch`

:

try_divide2 <- function(number){ tryCatch(expr = { divide2(number) }, error = function(e){ NA_real_ } ) } map(list("foobar", pi, 10), try_divide2) ## [[1]] ## [1] NA ## ## [[2]] ## [1] 1.570796 ## ## [[3]] ## [1] 5

As said - same output, different syntax. Choose your approach. As an appetizer, the same works with base-R functional programming as well. For example, with `lapply`

it will look like this:

lapply(list("foobar", pi, 10), try_divide2) ## [[1]] ## [1] NA ## ## [[2]] ## [1] 1.570796 ## ## [[3]] ## [1] 5

## Bonus: a benchmark

Before going off on your merry, error handled way, I also provide a short comparison between `safely`

, `possibly`

, and `tryCatch`

.

Let’s assume a data set with 20% errors. For simplicity, we’ll use a modified `log`

instead of our `divide2`

(similar to the example provided in `purrr`

’s documentation). It makes sense to fail `log`

in an all numeric vector (yield an error if there is a negative value). Since a negative log is `NaN`

(not an error but rather a warning) I’m creating an `error_prone_log`

function.

library(tibble) library(dplyr) library(microbenchmark) library(ggplot2) error_prone_log <- function(x){ if (x < 0){ stop("Negative x") } log(x) } safe_log <- safely(error_prone_log, otherwise = NaN, quiet = T) possible_log <- possibly(error_prone_log, otherwise = NaN, quiet = T) tryCatch_log <- function(x){ tryCatch(expr = error_prone_log(x), error = function(e){ NaN }) } set.seed(42) test_values <- sample( c(runif(n = 80, min = 0, max = 100), runif(n = 20, min = -100, max = 0)), size = 100, replace = FALSE) bench_results <- microbenchmark(map(test_values, safe_log), map(test_values, possible_log), map(test_values, tryCatch_log), times = 1000) gt::gt(summary(bench_results)) %>% gt::fmt_number(columns = 2:7, decimals = 2)

expr | min | lq | mean | median | uq | max | neval | cld |
---|---|---|---|---|---|---|---|---|

map(test_values, safe_log) | 3.96 | 4.08 | 4.55 | 4.16 | 4.94 | 10.10 | 1000 | c |

map(test_values, possible_log) | 3.74 | 3.86 | 4.27 | 3.92 | 4.64 | 9.60 | 1000 | b |

map(test_values, tryCatch_log) | 2.97 | 3.05 | 3.45 | 3.09 | 3.68 | 73.99 | 1000 | a |

autoplot(bench_results) + ggtitle("Comparison of safely, possibly, and tryCatch") + coord_flip(ylim = c(1.5, 15))

As can be seen from the benchmark’s results, there is no doubt about it: `tryCatch`

is the fastest. The `purrr`

functions `safely`

and `possibly`

take longer to run. In my opinion though they are slightly more convenient in terms of syntax.

In addition, the `safely`

variation allows us to retrieve the error messages conveniently for later examination. Again, this capability comes with a price when compared to `tryCatch`

, but it is roughly the same run time when compared to `possibly`

.

## Conclusion

When you have a long looping process which is prone to errors, for example a pricey web API call or a really large data set, you should aim to catch and handle errors gracefully instead of hoping for the best.

If you really need to be efficient, it’s probably worth while to `tryCatch`

, and if your looking more for ease of use and code readability you should use `possibly`

. In case you need to examine the error messages more thoroughly, use `safely`

.

Good luck hunting down the errors!

**leave a comment**for the author, please follow the link and comment on their blog:

**R | Adi Sarid**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.