Tensor Algebra: Efficient Operations on Multidimensional Arrays with R

January 25, 2013

(This article was first published on Scientific Memo, and kindly contributed to R-bloggers)

 Multidimensional arrays are ubiquitous. Any complex problem having multivariate observables would easily generate a need to represent corresponding data in multidimensional arrays. Most of the practitioners would choose to apply operations on these objects with an index notation.  For example a 3-D array would have 3 indices A[i, j, k] or A[i][j][k] depending on the syntax. However there are languages which takes array operations as their native component, for example APL. More familiar examples are matrix operations in MATLAB, Fortran 90 and alike. Of course this concept is noting more than vectorisation.

Mathematical objects called tensors can be used to represent multidimensional objects. In reality a scalar is rank 0 tensor,  so scalar is the simplest tensor. Rank of a tensor can be thought as objects spatial dimension.  Let’s consider matrix multiplication, where there are two indices, tensor of rank 2. If we have two matrices $A_{ij}$ and $B_{ij}$. We can multiply them with index notation as follows using an obscure language R with a naive algorithm:

> set.seed(42)
> A <- matrix(rnorm(4), 2, 2)
> B <- matrix(rnorm(4), 2, 2)
> C <- matrix(0, 2, 2)
> for(i in 1:2) {
+ for(j in 1:2) {
+ for(k in 1:2) {
+ C[i,j] = C[i,j] + A[i, k] * B[k,j]
+ }
+ }
+ }
> C
[,1] [,2]
[1,] 6.413566 -2.5178879
[2,] -2.249789 0.4908884
> A %*% B
[,1] [,2]
[1,] 0.5156982 2.0378605
[2,] -0.2954518 -0.9134599

You should have noticed that we introduce a third index. That is called a dummy index which we actually do the sum. This approach is technically called Einstein summation rule. So we represent the multiplication as follows: $C_{ij} = \sum_{k=1}^{m} A_{ik} B_{kj}$. Normally sum is dropped for short notation. It is quite intuitive from this notation that number of columns of A should be equal to number of rows in B to perform the sum consistently.

There are software packages that can do symbolic and numerical tensor algebra . Here we demonstrate an R package called tensorA developed by Professor van den Boogaart of TU Freiberg. We can repeat the matrix multiplication by using this package as follows.

> library("tensorA")
> set.seed(42)
> At <- to.tensor(rnorm(4), c(i=2, k=2))
> Bt <- to.tensor(rnorm(4), c(k=2, j=2))
> At %e% Bt
i [,1] [,2]
[1,] 0.5156982 2.0378605
[2,] -0.2954518 -0.9134599
[1] "tensor" "matrix"

Note that here $\%e\%$ implies tensor multiplication with Einstein summation rule. This is pretty trivial example but if you imagine higher dimensional objects, tracking indices would be cumbersome so multi-linear algebra makes life easy.

In conclusion,  I think,  using tensor arithmetic for multidimensional arrays looks more compacts and efficient (~ 2-3 times).  A discussion related to this appeared in R help list.

To leave a comment for the author, please follow the link and comment on their blog: Scientific Memo.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)