A short note on the startsWith function

The startsWith function comes with base R, and determines whether each entry of a character vector starts with a given prefix. (The endsWith function does the same thing but for suffixes.) The following code checks if each of “ant”, “banana” and “balloon” starts with “a”:

startsWith(c("ant", "banana", "balloon"), "a")
# [1]  TRUE FALSE FALSE
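
For comparison, here is a minimal sketch of the suffix version mentioned above, run on the same three words. The variable names words and ends_in_n are only illustrative:

```r
# endsWith() mirrors startsWith(), but tests the end of each string.
words <- c("ant", "banana", "balloon")

# Only "balloon" ends with "n".
ends_in_n <- endsWith(words, "n")
print(ends_in_n)
```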

The second argument (the prefix to check) can also be a vector. The code below checks if “ant” starts with “a” and if “ant” starts with “b”:

startsWith("ant", c("a", "b"))
# [1]  TRUE FALSE

Where things might get a bit unintuitive is when both arguments are vectors of length >1. Why do you think the line of code below returned the result it did?

startsWith(c("ant", "banana", "balloon"), c("a", "b"))
# [1]  TRUE  TRUE FALSE

This makes sense when we look at the documentation for startsWith’s return value:

A logical vector, of “common length” of x and prefix (or suffix), i.e., of the longer of the two lengths unless one of them is zero when the result is also of zero length. A shorter input is recycled to the output length.

startsWith(x, prefix) checks if x[i] starts with prefix[i] for each i. In our line of code above, the function checks if “ant” starts with “a” and if “banana” starts with “b”. Since x is longer than prefix, prefix is “recycled”: the function goes back to its first element and checks if “balloon” starts with “a”, which it does not.
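
To make the recycling concrete, the sketch below builds the recycled prefix vector by hand with rep_len and confirms that it gives the same answer as letting startsWith do the recycling. The names recycled and manual are only illustrative. The last line shows the zero-length case from the documentation passage quoted above.

```r
x <- c("ant", "banana", "balloon")
prefix <- c("a", "b")

# Recycle prefix by hand to the length of x: "a", "b", "a"
recycled <- rep_len(prefix, length(x))

# Element-wise check of x[i] against recycled[i]
manual <- startsWith(x, recycled)
print(manual)

# Should agree with letting startsWith() recycle prefix itself
print(identical(manual, startsWith(x, prefix)))

# A zero-length input gives a zero-length result
print(startsWith(x, character(0)))
```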

If we want to check whether each x[i] starts with any prefix[j] (with j possibly being different from i), we could do the following:

x <- c("ant", "banana", "balloon")
prefix <- c("a", "b")
# One column per prefix, one row per element of x
has_prefix <- sapply(prefix, function(p) startsWith(x, p))
has_prefix
#          a     b
# [1,]  TRUE FALSE
# [2,] FALSE  TRUE
# [3,] FALSE  TRUE

# A row is TRUE if the corresponding element of x matches at least one prefix
apply(has_prefix, 1, any)
# [1] TRUE TRUE TRUE
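
One caveat with the sapply approach above: when x has a single element, sapply returns a plain vector rather than a matrix, and the apply call then fails. A possible alternative, sketched here under the assumption that we only need the final "any prefix" answer, loops over x with vapply instead. The helper name starts_with_any is made up for this note and is not part of base R.

```r
x <- c("ant", "banana", "balloon")
prefix <- c("a", "b")

# Hypothetical helper: for each string, test it against every prefix and
# return TRUE if at least one prefix matches. vapply() returns a logical
# vector of length(x), including when x has only one element.
starts_with_any <- function(x, prefix) {
  vapply(x, function(s) any(startsWith(s, prefix)), logical(1),
         USE.NAMES = FALSE)
}

print(starts_with_any(x, prefix))
print(starts_with_any("cat", prefix))
```

Looping over x calls startsWith once per string rather than once per prefix, so for a long x and only a few prefixes the sapply version is likely the cheaper of the two.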
