Let’s make this a quick and quite basic one. There is this incredibly useful function in R called `ifelse()`. It’s basically a vectorized version of the if … else control structure that every programming language has in one form or another. `ifelse()` has, in my view, two major advantages over if … else:

1. It’s super fast.
2. It’s more convenient to use.

The basic idea is that you have a vector of values, you test each value against some condition, and depending on the outcome you want a specific value at the same position in another vector. An example follows below. First, let’s load the `{rbenchmark}` package to see the speed benefits.

`library(rbenchmark)`

Now, the toy example: I am creating a vector of half a million random normally distributed values. For each of these values, I want to know whether the value is below or above zero.

`x <- rnorm(500000)`

`ifelse()` is used as `ifelse(test, yes, no)`, so we need three arguments. My test is `x < 0` and I want to have the string `"negative"` in `y` whenever the corresponding value in `x` is smaller than zero. If this is not the case, then `y` should have a `"positive"` in this position. `ifelse()` only needs one line of code for this.

```r
benchmark(replications = 50, {
  y <- ifelse(x < 0, "negative", "positive")
})$user.self
## 5.88
```
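To make the mechanics visible, here is a minimal sketch on a hand-picked four-element vector. The name `small_x` and its values are made up for illustration and are not part of the benchmark. `ifelse()` evaluates the test for every element at once and returns a vector of the same length.

```r
# Hypothetical toy vector, chosen only to illustrate the call
small_x <- c(-1.5, 0.3, 2.1, -0.2)

# test = small_x < 0, yes = "negative", no = "positive"
small_y <- ifelse(small_x < 0, "negative", "positive")

print(small_y)
## [1] "negative" "positive" "positive" "negative"
```

Each position in `small_y` corresponds to the same position in `small_x`, which is exactly what happens for the half a million values in `x`.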

We could also solve this with a `for` loop. But, as you can see, this takes approx. 3 times as long.

```r
benchmark(replications = 50, {
  y <- c()
  for (i in x) {
    # append one label per value of x
    if (i < 0) {
      y[length(y) + 1] <- "negative"
    } else {
      y[length(y) + 1] <- "positive"
    }
  }
})$user.self
## 16.938
```
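Part of the loop's cost may come from growing `y` one element at a time. As an aside that is not part of the original benchmark (so no timing is quoted for it), here is a sketch of a loop that preallocates the result and fills it by index. The names `x_small` and `y_loop`, the seed, and the smaller vector size are assumptions made to keep the sketch quick to run.

```r
set.seed(1)                # assumption: fixed seed so the sketch is reproducible
x_small <- rnorm(1000)     # hypothetical, smaller stand-in for x

# Preallocate a character vector of the final length
y_loop <- character(length(x_small))

for (i in seq_along(x_small)) {
  if (x_small[i] < 0) {
    y_loop[i] <- "negative"
  } else {
    y_loop[i] <- "positive"
  }
}

# The loop and ifelse() should agree element by element
stopifnot(identical(y_loop, ifelse(x_small < 0, "negative", "positive")))
```

Wrapping this variant in `benchmark()` the same way as above would show how much of the gap to `ifelse()` remains on your machine.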

The same is true for an `sapply()` version. `sapply()` even consistently takes a little longer than a `for` loop in this case - to my surprise.

```r
benchmark(replications = 50, {
  y <- sapply(x, USE.NAMES = FALSE, FUN = function(i) {
    if (i < 0) {
      "negative"
    } else {
      "positive"
    }
  })
})$user.self
## 20.423
```

It’s highly unlikely that `rnorm()` produces a value of exactly zero. But we could also check for this by simply nesting calls to `ifelse()`. If you want to do this, you simply add another `ifelse()` in the “FALSE” part of the previous `ifelse()` as I did below. In this little toy example, this nested test is still considerably faster than the `for` or `sapply()` versions of the single test.

```r
benchmark(replications = 50, {
  y <- ifelse(x < 0, "negative",
              ifelse(x > 0, "positive", "exactly zero"))
})$user.self
## 12.197
```
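Since `rnorm()` will almost never hit zero, the third label is hard to observe in the benchmark itself. A minimal sketch with a hand-made vector that does contain an exact zero shows the nested call at work. The names `z` and `z_label` are hypothetical and chosen only for this illustration.

```r
# Hypothetical vector that, unlike typical rnorm() output, contains an exact zero
z <- c(-2, 0, 3.5)

# The inner ifelse() is only consulted where the outer test is FALSE
z_label <- ifelse(z < 0, "negative",
                  ifelse(z > 0, "positive", "exactly zero"))

# z_label is now c("negative", "exactly zero", "positive")
print(z_label)
```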