GBLUP example in R

March 30, 2012
By

(This article was first published on Gregor Gorjanc (gg), and kindly contributed to R-bloggers)

Shirin Amiri was asking about GBLUP (genomic BLUP) and based on her example I set up the following R script to show how GBLUP works. Note that this is the so called marker model, where we estimate allele substitution effects of the markers and not individual based model, where genomic breeding values are inferred directly. The code:

library(package="MatrixModels")
 
dat <- data.frame( y=c(1.5, 2.35, 3.4, 2.31, 1.53),
s1=c("AA", "Aa", "Aa", "AA", "aa"),
s2=c("Bb", "BB", "bb", "BB", "bb"),
s3=c("cc", "Cc", "Cc", "cc", "CC"),
s4=c("Dd", "Dd", "DD", "dd", "Dd"))
 
cols <- paste("s", 1:4, sep="")
dat[, cols] <- lapply(dat[, cols], function(z) as.integer(z) - 1)
 
lambda <- 1
 
(y <- dat$y)
(X <- model.Matrix(y ~ 1, data=dat, sparse=TRUE))
(Z <- model.Matrix(y ~ -1 + s1 + s2 + s3 + s4, data=dat, sparse=TRUE))
 
XX <- crossprod(X)
XZ <- crossprod(X, Z)
ZZ <- crossprod(Z)
 
Xy <- crossprod(X, y)
Zy <- crossprod(Z, y)
 
(LHS <- rBind(cBind(XX, XZ),
cBind(t(XZ), ZZ + Diagonal(n=dim(ZZ)[1]) * lambda)))
 
(RHS <- rBind(Xy,
Zy))
 
(sol <- solve(LHS, RHS))
 
(GEBV <- Z %*% sol[-1])

And the transcript:

R version 2.14.2 (2012-02-29)
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)
 
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
 
Natural language support but running in an English locale
 
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
 
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
 
>
> library(package="MatrixModels")
Loading required package: Matrix
Loading required package: lattice
 
Attaching package: ‘Matrix
 
The following object(s) are masked from ‘package:base’:
 
det
 
 
Attaching package: ‘MatrixModels’
 
The following object(s) are masked from ‘package:stats’:
 
getCall
 
>
> dat <- data.frame( y=c(1.5, 2.35, 3.4, 2.31, 1.53),
+ s1=c("AA", "Aa", "Aa", "AA", "aa"),
+ s2=c("Bb", "BB", "bb", "BB", "bb"),
+ s3=c("cc", "Cc", "Cc", "cc", "CC"),
+ s4=c("Dd", "Dd", "DD", "dd", "Dd"))
>
> cols <- paste("s", 1:4, sep="")
> dat[, cols] <- lapply(dat[, cols], function(z) as.integer(z) - 1)
>
> lambda <- 1
>
> (y <- dat$y)
[1] 1.50 2.35 3.40 2.31 1.53
> (X <- model.Matrix(y ~ 1, data=dat, sparse=TRUE))
"dsparseModelMatrix": 5 x 1 sparse Matrix of class "dgCMatrix"
(Intercept)
1 1
2 1
3 1
4 1
5 1
@ assign: 0
@ contrasts:
named list()
> (Z <- model.Matrix(y ~ -1 + s1 + s2 + s3 + s4, data=dat, sparse=TRUE))
"dsparseModelMatrix": 5 x 4 sparse Matrix of class "dgCMatrix"
s1 s2 s3 s4
1 2 1 . 1
2 1 2 1 1
3 1 . 1 2
4 2 2 . .
5 . . 2 1
@ assign: 1 2 3 4
@ contrasts:
named list()
>
> XX <- crossprod(X)
> XZ <- crossprod(X, Z)
> ZZ <- crossprod(Z)
>
> Xy <- crossprod(X, y)
> Zy <- crossprod(Z, y)
>
> (LHS <- rBind(cBind(XX, XZ),
+ cBind(t(XZ), ZZ + Diagonal(n=dim(ZZ)[1]) * lambda)))
5 x 5 sparse Matrix of class "dgCMatrix"
(Intercept) s1 s2 s3 s4
(Intercept) 5 6 5 4 5
s1 6 11 8 2 5
s2 5 8 10 2 3
s3 4 2 2 7 5
s4 5 5 3 5 8
>
> (RHS <- rBind(Xy,
+ Zy))
5 x 1 Matrix of class "dgeMatrix"
[,1]
(Intercept) 11.09
s1 13.37
s2 10.82
s3 8.81
s4 12.18
>
> (sol <- solve(LHS, RHS))
5 x 1 Matrix of class "dgeMatrix"
[,1]
(Intercept) 1.65468864
s1 0.05223443
s2 0.08655678
s3 -0.05223443
s4 0.45586081
>
> (GEBV <- Z %*% sol[-1])
5 x 1 Matrix of class "dgeMatrix"
[,1]
1 0.6468864
2 0.6289744
3 0.9117216
4 0.2775824
5 0.3513919
>
> proc.time()
user system elapsed
2.330 0.070 2.412

To leave a comment for the author, please follow the link and comment on their blog: Gregor Gorjanc (gg).

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Mango solutions



RStudio homepage



Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training



http://www.eoda.de







ODSC

ODSC

CRC R books series











Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)