# A multidimensional “which” function

September 16, 2011
By

(This article was first published on Mark van der Loo, and kindly contributed to R-bloggers)

update Henrik Bengtsson commented that which(x, arr.ind=TRUE) gives the same result, rendering the blog below academic (thanks for the comment!). So, for academic interest, I’ll leave it. In my defense, I implemented this kind of functionality in C some time ago, so I did not even think about RTFM. (So there’s a lesson in this blog after all )

The well-known which function accepts a logical vector and returns the indices where its value equals TRUE. Actually, which also accepts matrices or multidimensional arrays. Internally, R uses a single index to run through such two- or higher-dimensional structures, in a column-first fashion. This is easy for computers, but for us poor humans it is less readable. The following function gives the multi-index of TRUE values into any d1 x d2 x … x dn – dimensional array.

?View Code RSPLUS
 # A which for multidimensional arrays. # Mark van der Loo 16.09.2011 # # A Array of booleans # returns a sum(A) x length(dim(A)) array of multi-indices where A == TRUE # multi.which <- function(A){ if ( is.vector(A) ) return(which(A)) d <- dim(A) T <- which(A) - 1 nd <- length(d) t( sapply(T, function(t){ I <- integer(nd) I[1] <- t %% d[1] sapply(2:nd, function(j){ I[j] <<- (t %/% prod(d[1:(j-1)])) %% d[j] }) I }) + 1 ) }

For example. Let’s create a 2x3x2 logical array (2 rows, three columns, and this structure times 2):

?View Code RSPLUS
 > set.seed(1) > set.seed(1) > (B <- array(sample(c(TRUE,FALSE),12,replace=TRUE),dim=c(2,3,2)) ) , , 1   [,1] [,2] [,3] [1,] TRUE FALSE TRUE [2,] TRUE FALSE FALSE   , , 2   [,1] [,2] [,3] [1,] FALSE FALSE TRUE [2,] FALSE TRUE TRUE

The standard which function gives 1-dimensional indices:

?View Code RSPLUS
 > which(B) [1] 1 2 5 10 11 12

If you don’t need to see the result, this is fine. However, sometimes it is convenient to have the multi-index available. For example, the element in the first row of the first column of the first matrix of B equals TRUE. That is, element (1,1,1). The multi.which function returns all multi-indices where coefficients are TRUE:

?View Code RSPLUS
 > multi.which(B) [,1] [,2] [,3] [1,] 1 1 1 [2,] 2 1 1 [3,] 1 3 1 [4,] 2 2 2 [5,] 1 3 2 [6,] 2 3 2

The result is a 2-dimensional array, where each row is a single multi-index. You can check the last row by confirming that the second row of the third column of the second matrix indeed has coefficient TRUE. As noted, the function works for any multidimensional array (including vectors and matrices).

So, how does it all work? I will just give the basic equation here, but see this paper for a more thorough description and the inverse relation. Basically, you can regard the multi-index as a positional number system, with the first index running fastest. (Remember, that our decimal notation system is a positional system, but with the first number running slowest).

Denote the single index in a d1 x d2 x … x dn – dimensional array with t. The multi-index I may be written as

where

and the product equals 1 if j=1. The symbols div and mod stand for integer division and remainder upon division. This equation assumes base 0 indexing, meaning that both the single and multi-indexing start at 0. Since R uses base 1 indexing, the first and 11th line in multi.which first subtract, then add one to correct for this.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...