[This article was first published on R – Statistical Odds & Ends, and kindly contributed to R-bloggers.]

R has a colon operator which makes it really easy to define a sequence of integers. For example, the code 1:10 generates a vector consisting of the integers from 1 to 10 (inclusive). However, using the colon operator is not without its pitfalls! I will highlight two common mistakes here.

First, imagine that you have a variable n which has value 5. What do you think the following code prints out?

for (i in 1:n+1) print(i)


My first instinct is that it should print out the numbers 1, 2, …, 6 (inclusive), with one number on each line. Wrong! Instead, this is the output we get:

[1] 2
[1] 3
[1] 4
[1] 5
[1] 6

What is going on here? The problem is one of operator precedence. Just as $\times$ and $\div$ come before $+$ and $-$ in arithmetic, in R the : operator comes before +. Hence, the code written above is interpreted as

for (i in (1:n)+1) print(i)


which is why the numbers 2 to 6 are printed out instead of the numbers 1 to 6. If we want to print the numbers 1 to n+1 inclusive, we need parentheses to enforce the intended order of evaluation:

for (i in 1:(n+1)) print(i)
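
To see the precedence rule without the loop, it can help to look at the index vectors themselves. The following is only a small illustrative check, using the same n = 5 as above; the variable names shifted, explicit and intended are made up for this example.

```r
n <- 5

# `:` binds more tightly than `+`, so 1:n+1 is parsed as (1:n) + 1
shifted  <- 1:n+1
explicit <- (1:n)+1

# Parentheses around n+1 give the sequence we actually wanted
intended <- 1:(n+1)

print(shifted)   # [1] 2 3 4 5 6
print(explicit)  # [1] 2 3 4 5 6
print(intended)  # [1] 1 2 3 4 5 6
```

The first two vectors are identical, which confirms that the unparenthesized version shifts the whole sequence by one rather than extending it.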


Let’s move on to the second common mistake. Let’s say I have a vector vec and I want to print its elements one by one. The first instinct of most of us would be to write something like this:

for (i in 1:length(vec)) print(vec[i])


This works most of the time, but not all the time. Consider what happens when vec is an empty vector:

vec <- c()
for (i in 1:length(vec)) print(vec[i])


NULL
NULL

What happened here? The problem is that the colon operator can return a descending sequence of integers! In the code above, length(vec) has value 0, so 1:length(vec) is 1:0, which is the same sequence as c(1, 0). The loop therefore prints vec[1] and vec[0], which are both NULL because c() is itself NULL.

To avoid this problem, use the seq_along function instead:

for (i in seq_along(vec)) print(vec[i])
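
A quick way to convince yourself is to compare the index vectors for an empty input. The sketch below also uses seq_len, a base R relative of seq_along that takes a length rather than a vector; it is not part of the discussion above and is included only for comparison. The counter variable is hypothetical and is there just to show that the loop body never runs.

```r
vec <- c()

# 1:length(vec) is 1:0, which counts down and has two elements
print(1:length(vec))          # [1] 1 0

# seq_along(vec) and seq_len(length(vec)) are both empty for an empty vec
print(seq_along(vec))         # integer(0)
print(seq_len(length(vec)))   # integer(0)

# Looping over an empty index vector executes the body zero times
count <- 0
for (i in seq_along(vec)) count <- count + 1
print(count)                  # [1] 0
```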


You may think that this is not really a big problem; after all, it only fails when we have an empty vector, right? Well, there are two responses to that. First, you don’t want your code to ever do anything unintended. In this case the mistake was easy to catch, but it might instead be buried three levels deep in code that is thousands of lines long, where it is not so easy to catch anymore! Second, this mistake crops up more easily when you don’t start from the first element of the vector.
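
As a hedged illustration of that second point (the one-element vector and both loops below are made up for this example), suppose we want to skip the first element and loop from the second one onwards. With 2:length(vec), a vector of length 1 is already enough to trigger the descending sequence. Dropping the first index from seq_along(vec) is one possible way around it, not the only one.

```r
vec <- c(10)   # hypothetical one-element vector

# 2:length(vec) is 2:1, i.e. 2 then 1, so the loop runs twice
for (i in 2:length(vec)) print(vec[i])
# [1] NA
# [1] 10

# One possible fix: take seq_along(vec) and drop its first index
for (i in seq_along(vec)[-1]) print(vec[i])
# prints nothing, because there is no second element
```

The same expression also behaves sensibly for an empty vector, since removing the first element of an empty index vector leaves it empty.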