R Tip: Use seqi() For Indexes

Posted on January 11, 2019 by John Mount in R bloggers

[This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers].

R Tip: use seqi() for indexing.

R's “1:0 trap” is a mal-feature that confuses newcomers and is a reliable source of bugs. This note shows how to use seqi() to write more reliable code and document intent.

The issue is that, contrary to expectations formed by working with other programming languages, the sequence 1:0 is not empty. It is instead the decreasing sequence 1, 0. Data scientists typically work in many languages, so we should expect differences. However, a sequence builder that returns an empty sequence when the bounds cross is a common and useful tool for controlling loops and other indexing tasks.
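A short base-R illustration of the trap may help. The empty vector `x` and the accumulator `visited` are made-up names for this sketch, not part of the original example.

```r
# The colon operator counts down when the bounds cross.
1:0
#> [1] 1 0
length(1:0)
#> [1] 2

# A loop meant to visit each element of an empty vector
# still runs twice, with i = 1 and then i = 0.
x <- numeric(0)
visited <- integer(0)
for (i in 1:length(x)) {
  visited <- c(visited, i)
}
visited
#> [1] 1 0
```

The loop body runs for indices that do not exist in `x`, which is exactly the kind of bug the rest of this note is about.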

We have written about this before. The usual defense is that 1:0 is the same as seq(1, 0), but I see that more as doubling down than as an argument. Also, because of odd behavior when iterating over vectors or lists with class attributes, we sometimes must introduce indices (it isn't always safe to iterate directly over contents in R).
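One well-known instance of the class-attribute problem is the Date class: a `for` loop over a Date vector hands the loop body bare numbers. The sketch below uses only base R; the variable names (`dates`, `direct`, `indexed`) are illustrative.

```r
# Iterating directly over a Date vector drops the class attribute,
# so the loop variable is a bare number (days since 1970-01-01).
dates <- as.Date(c("2019-01-11", "2019-01-12"))
direct <- list()
for (d in dates) {
  direct[[length(direct) + 1]] <- class(d)
}
unlist(direct)
#> [1] "numeric" "numeric"

# Indexing keeps the class; seq_along() is empty for an empty vector.
indexed <- character(0)
for (i in seq_along(dates)) {
  indexed <- c(indexed, class(dates[[i]]))
}
indexed
#> [1] "Date" "Date"
```

This is why index-based loops remain common in R, and why a safe way to build the index vector matters.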

What this means is that in R there is no common, safe, succinct way to write index vectors or loops where one of the end-points is passed in as an argument. For example, the following simple function is incorrect.

# sum reciprocals of squares of positive integers from 1 up to k
# converges to pi^2/6
sum_sq_recip_k <- function(k) {
  sum(1/((1:k)^2))
}

# should be zero, as the convention 1 up to -1 is the empty set
sum_sq_recip_k(-1)
# [1] Inf

There are plenty of ways to write reversed sequences (such as rev(0:1)), so writing reversed sequences isn't a great unmet need. Previously we recommended using seq_len() as a solution. That is still good advice; however, it only directly addresses upper-bound issues. For general ranges (where perhaps the lower bound is the parameter) we still have a problem.
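As a sketch of the seq_len() approach, here is the earlier function rewritten under the hypothetical name `sum_sq_recip_len`. Note that seq_len() rejects a negative argument with an error instead of returning an empty sequence, so this variant handles k = 0 but not k = -1.

```r
# Variant of the example above using seq_len() for the upper bound.
sum_sq_recip_len <- function(k) {
  sum(1 / (seq_len(k)^2))
}

sum_sq_recip_len(2)
#> [1] 1.25

# k = 0 now gives the empty sum.
sum_sq_recip_len(0)
#> [1] 0

# seq_len() always starts at 1, so a range such as "a up to 5"
# with a parameter a still needs something else.
```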

Python is one of the most popular programming languages, and it supplies a convenient function for the common task of iterating over increasing ranges of integers.

# Python code

[k for k in range(3, 5)]
# Out[1]: [3, 4]

[k for k in range(5, 3)]
# Out[2]: []

Of course, different programming languages make different choices. However, in my opinion, writing possibly empty sequences parametrically is a common programming need, and it is nice to have this be convenient.

Our current advice to R users is to use wrapr::seqi(), which stands for “sequence, increasing integer(s)”. We needed such a capability when translating C++ code to R code for our RcppDynProg example (otherwise we would have had to put guards around the loops so they don't activate on what should be empty sequences).

seqi() is used as follows.

library("wrapr")

# print 3, 4, and then 5
for(i in seqi(3, 5)) {
   print(i)
}
#> [1] 3
#> [1] 4
#> [1] 5

# empty
for(i in seqi(5, 2)) {
   print(i)
}

This is clear, safe, and documents intent. It is a non-negotiable fact that in R base::seq(1,0) is [1, 0]. Well, wrapr::seqi(1,0) is [].
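For readers who want to see the idea without installing a package, the following is a minimal base-R sketch of an increasing-only sequence builder. The names `seq_increasing` and `sum_sq_recip_safe` are hypothetical, and this is not wrapr's implementation; it only mirrors the behavior described above for integer bounds.

```r
# Hypothetical helper (not wrapr's code): integers from a up to b,
# or an empty vector when a is greater than b.
seq_increasing <- function(a, b) {
  if (a > b) {
    return(integer(0))
  }
  seq.int(a, b)
}

# The earlier example, now returning the empty sum for crossed bounds.
sum_sq_recip_safe <- function(k) {
  sum(1 / (seq_increasing(1, k)^2))
}

sum_sq_recip_safe(2)
#> [1] 1.25
sum_sq_recip_safe(-1)
#> [1] 0
length(seq_increasing(5, 2))
#> [1] 0
```

In real code, wrapr::seqi() plays the role of `seq_increasing` here, and either bound can be the parameter.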
