ASReml-R: Storing A inverse as a sparse matrix

December 18, 2010
By

(This article was first published on Gregor Gorjanc (gg), and kindly contributed to R-bloggers)

I was testing ASReml-R program (an R package that links propriety ASReml binaries that can be used only with valid licence) this week and had to do some manipulations with the numerator relationship matrix (A). ASReml-R provides a function (asreml.Ainverse) that can create inverse of A directly from the pedigree as this inverse is needed in pedigree based mixed model. Bulding inverse of A directly from a pedigree is a well known result dating back to Henderson in 1970s or so. The funny thing is that it is cheaper to setup inverse of A directly than to setup up first A and then to invert it. In addition, inverse of A is very spare so it is easy/cheap to store it. Documentation for asreml.Ainverse has a tiny example of usage. Since the result of this function is a list with several elements (data.frame with "triplets" for non-zero elements of inverse of A, inbreeding coefficients, ...) example also shows how to create a matrix object in R as shown bellow:
library(package="asreml")
## Create test pedigree
ped <- data.frame( me=c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
dad=c(0, 0, 0, 1, 1, 2, 4, 5, 7, 9),
mum=c(0, 0, 0, 1, 1, 2, 6, 6, 8, 9))
## Create inverse of A in triplet form
tmp <- asreml.Ainverse(pedigree=ped)$ginv
## Create a "proper" matrix
AInv <- asreml.sparse2mat(x=tmp)
## Print AInv
AInv
So the inverse of A would be:
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]      [,9]     [,10]
[1,] 5 0 0 -2 -2 0 0.0 0.0 0.000000 0.000000
[2,] 0 3 0 0 0 -2 0.0 0.0 0.000000 0.000000
[3,] 0 0 1 0 0 0 0.0 0.0 0.000000 0.000000
[4,] -2 0 0 3 0 1 -2.0 0.0 0.000000 0.000000
[5,] -2 0 0 0 3 1 0.0 -2.0 0.000000 0.000000
[6,] 0 -2 0 1 1 4 -2.0 -2.0 0.000000 0.000000
[7,] 0 0 0 -2 0 -2 4.5 0.5 -1.000000 0.000000
[8,] 0 0 0 0 -2 -2 0.5 4.5 -1.000000 0.000000
[9,] 0 0 0 0 0 0 -1.0 -1.0 4.909091 -2.909091
[10,] 0 0 0 0 0 0 0.0 0.0 -2.909091 2.909091
However, this is problematic as it creates a dense matrix - zero values are also stored (you can see them). If we would have 1,000 individuals, such a matrix would consume 7.6 Mb of RAM (= (((1000 * (1000 + 1)) / 2) * 16) / 2^20). This is not a lot, but with 10,000 individuals we would already need 763 Mb of RAM, which can create some bottlenecks. A solution is to create a sparse matrix using the Matrix R package. Luckily we have all the ingredients prepared by asreml.Ainverse function - the triplets. However, the essential R code is a bit daunting and I had to test several options before I figured it out - code from my previous post helped;)
## Load package
library(package="Matrix")
## Number of pedigree members
nI <- nrow(ped)
## Store inverse of A in sparse form
AInv2 <- as(new("dsTMatrix",
Dim=c(nI, nI),
uplo="L",
i=(as.integer(tmp$Row) - 1L),
j=(as.integer(tmp$Column) - 1L),
x=tmp$Ainverse),
"dsCMatrix")
## Add row and column names - optional
dimnames(AInv2) <- list(attr(x=tmp, which="rowNames"),
attr(x=tmp, which="rowNames"))
## Print AInv
AInv2
And the inverse of A is now:
10 x 10 sparse Matrix of class "dsCMatrix"
[[ suppressing 10 column names ‘1’, ‘2’, ‘3’ ... ]]

1 5 . . -2 -2 . . . . .
2 . 3 . . . -2 . . . .
3 . . 1 . . . . . . .
4 -2 . . 3 . 1 -2.0 . . .
5 -2 . . . 3 1 . -2.0 . .
6 . -2 . 1 1 4 -2.0 -2.0 . .
7 . . . -2 . -2 4.5 0.5 -1.000000 .
8 . . . . -2 -2 0.5 4.5 -1.000000 .
9 . . . . . . -1.0 -1.0 4.909091 -2.909091
10 . . . . . . . . -2.909091 2.909091
you can clearly see the structure and it soon becomes obvious why such a storage is more efficient.

If we want to go back from matrix to triplet form (this might be useful if we want to create a matrix for programs as ASReml) we can use the following code:
## Convert back to triplet form - first the matrix
tmp2 <- as(AInv2, "dsTMatrix")
## ... - now to data.frame
tmp3 <- data.frame(Row=tmp2@i + 1, Column=tmp2@j + 1, Ainverse=tmp2@x)
## Sort
tmp3 <- tmp3[order(tmp3$Row, tmp3$Column), ]
## Test that we get the same stuff
any((tmp[, 3] - tmp3[, 3]) > 0)
## ASReml-R specificities
attr(x=tmp3, which="rowNames") <- rownames(tmp)
attr(x=tmp3, which="geneticGroups") <- c(0, 0)

To leave a comment for the author, please follow the link and comment on his blog: Gregor Gorjanc (gg).

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.