A multidimensional "which" function

[This article was first published on yaRb, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The well-known which function accepts a logical vector and returns the indices where its value equals TRUE. Actually, which also accepts matrices or multidimensional arrays. Internally, R uses a single index to run through such two- or higher-dimensional structures, in a column-first fashion. This is easy for computers, but for us poor humans it is less readable. The following function gives the multi-index of TRUE values into any d1 x d2 x … x dn – dimensional array.
# A which for multidimensional arrays.
# Mark van der Loo 16.09.2011
# A Array of booleans
# returns a sum(A) x length(dim(A)) array of multi-indices where A == TRUE
multi.which <- function(A){
    if ( is.vector(A) ) return(which(A))
    d <- dim(A)
    T <- which(A) - 1
    nd <- length(d)
    t( sapply(T, function(t){
        I <- integer(nd)
        I[1] <- t %% d[1]
        sapply(2:nd, function(j){
            I[j] <<- (t %/% prod(d[1:(j-1)])) %% d[j]
    }) + 1 )

For example. Let's create a 2x3x2 logical array (2 rows, three columns, and this structure times 2):
> set.seed(1)
> (B <- array(sample(c(TRUE,FALSE),12,replace=TRUE),dim=c(2,3,2)) )
, , 1
     [,1]  [,2]  [,3]
, , 2
      [,1]  [,2] [,3]

The standard which function gives 1-dimensional indices:
> which(B)
[1]  1  2  5 10 11 12

If you don't need to see the result, this is fine. However, sometimes it is convenient to have the multi-index available. For example, the element in the first row of the first column of the first matrix of B equals TRUE. That is, element (1,1,1). The multi.which function returns all multi-indices where coefficients are TRUE:
> multi.which(B)
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    1    1
[3,]    1    3    1
[4,]    2    2    2
[5,]    1    3    2
[6,]    2    3    2

The result is a 2-dimensional array, where each row is a single multi-index. You can check the last row by confirming that the second row of the third column of the second matrix indeed has coefficient TRUE. As noted, the function works for any multidimensional array (including vectors and matrices).

So, how does it all work? I will just give the basic equation here, but see this paper for a more thorough description and the inverse relation. Basically, you can regard the multi-index as a positional number system, with the first index running fastest. (Remember, that our decimal notation system is a positional system, but with the first number running slowest).

Denote the single index in a d1 x d2 x ... x dn - dimensional array with t. The multi-index I may be written as

I(t) = (i1,i2,...,in )


ij=(t div Πk=1j-1dk) mod dj,

and the product equals 1 if j=1. The symbols div and mod stand for integer division and remainder upon division. This equation assumes base 0 indexing, meaning that both the single and multi-indexing start at 0. Since R uses base 1 indexing, the first and 11th line in multi.which first subtract, then add one to correct for this.

To leave a comment for the author, please follow the link and comment on their blog: yaRb.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)