# Creating a Covariance Matrix from Scratch

January 7, 2013

(This article was first published on PsychoAnalytix Blog, and kindly contributed to R-bloggers)

I have been running several simulations that require a population covariance matrix. I needed to extend the code that I found in the psych package to more than two latent variables (the code probably allows it, but I could not figure out how). I then came across Jöreskog's 1971 paper and realized that I could use the confirmatory factor analysis model equation to build the population covariance matrix directly.

The code below builds a five-factor congeneric data structure with four indicators per factor.

fx is the factor loading matrix, err is a diagonal matrix with the error variances on the diagonal and zeros elsewhere, and phi is the matrix of correlations between the latent variables.
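In matrix notation, the model equation behind the code can be written as shown here. The Greek symbols are the conventional confirmatory factor analysis names and are notation added for this note, not something the code itself uses: Λ corresponds to fx (20×5), Φ to phi (5×5), and Θ to err (20×20, diagonal), so the resulting Σ is 20×20.

```latex
\Sigma = \Lambda \Phi \Lambda^{\top} + \Theta
```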

#######################################
###---Population Covariance Generation
#######################################
###---20x5 factor loading matrix, entered row by row (4 indicators per factor)
fx<-matrix(c(
.5,0,0,0,0,
.6,0,0,0,0,
.7,0,0,0,0,
.8,0,0,0,0,
0,.5,0,0,0,
0,.6,0,0,0,
0,.7,0,0,0,
0,.8,0,0,0,
0,0,.5,0,0,
0,0,.6,0,0,
0,0,.7,0,0,
0,0,.8,0,0,
0,0,0,.5,0,
0,0,0,.6,0,
0,0,0,.7,0,
0,0,0,.8,0,
0,0,0,0,.5,
0,0,0,0,.6,
0,0,0,0,.7,
0,0,0,0,.8), ncol=5, byrow=TRUE)

###--Error Variances
err<-diag(c(.6^2,.7^2,.8^2,.9^2,
.6^2,.7^2,.8^2,.9^2,
.6^2,.7^2,.8^2,.9^2,
.6^2,.7^2,.8^2,.9^2,
.6^2,.7^2,.8^2,.9^2))

###---5x5 matrix of factor correlations (all .3, with 1 on the diagonal)
phi<-matrix(.3, nrow=5, ncol=5)
diag(phi)<-1

sigma<-(fx%*%phi%*%t(fx)+err)

######################################
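
As a quick sanity check on the result, the sketch below rebuilds the same matrices in a more compact form and confirms that sigma is symmetric, positive definite, and has the entries the model equation implies. The use of kronecker() and rep() is a convenience chosen for this example, not the construction used above.

```r
# Compact rebuild of the matrices above (same values, different construction)
loadings <- c(.5, .6, .7, .8)
fx  <- kronecker(diag(5), matrix(loadings, ncol = 1))   # 20 x 5 loadings
err <- diag(rep(c(.6, .7, .8, .9)^2, times = 5))        # 20 x 20 error variances
phi <- matrix(.3, nrow = 5, ncol = 5)                   # factor correlations
diag(phi) <- 1

sigma <- fx %*% phi %*% t(fx) + err

dim(sigma)                                        # 20 x 20
isSymmetric(sigma)                                # should be TRUE
min(eigen(sigma, symmetric = TRUE)$values) > 0    # positive definite if TRUE

# Spot checks against the model equation
sigma[1, 1]   # .5^2 + .6^2    (variance of item 1)
sigma[1, 2]   # .5 * .6        (items 1 and 2, same factor)
sigma[1, 5]   # .5 * .3 * .5   (items 1 and 5, different factors)
```

If any of these checks fail after editing fx, phi, or err, the most likely cause is a phi matrix that is no longer a valid correlation matrix.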

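For sample data I used the mvrnorm() function from the MASS package. Its second argument, mu, must be a vector of means with one entry per observed variable, so the call below passes a zero vector of length nrow(fx) (zero means are assumed here):

library(MASS)
mvrnorm(100, mu=rep(0, nrow(fx)), Sigma=sigma)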
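
A slightly fuller sketch of the sampling step follows. It assumes zero means for all 20 observed variables and fixes a seed for reproducibility; the seed value and sample size are arbitrary choices made here. It also uses mvrnorm()'s empirical = TRUE option to show that a draw can be forced to reproduce sigma exactly.

```r
library(MASS)

# Rebuild the population matrices so this block runs on its own
fx  <- kronecker(diag(5), matrix(c(.5, .6, .7, .8), ncol = 1))
err <- diag(rep(c(.6, .7, .8, .9)^2, times = 5))
phi <- matrix(.3, nrow = 5, ncol = 5)
diag(phi) <- 1
sigma <- fx %*% phi %*% t(fx) + err

set.seed(123)                       # arbitrary seed
mu <- rep(0, nrow(fx))              # assumed zero means, one per variable

# Ordinary draw: the sample covariance only approximates sigma
x <- mvrnorm(n = 100, mu = mu, Sigma = sigma)
dim(x)                              # 100 rows, 20 columns
max(abs(cov(x) - sigma))            # sampling error, not zero

# empirical = TRUE rescales the draw so that cov(x_emp) matches sigma
x_emp <- mvrnorm(n = 100, mu = mu, Sigma = sigma, empirical = TRUE)
all.equal(cov(x_emp), sigma, check.attributes = FALSE)
```

The ordinary draw is usually what a simulation study wants, since sampling error is part of what is being studied; the empirical draw is mainly useful for checking that an estimation routine recovers known population values.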

To simulate parallel-form data, the nonzero loadings in the fx matrix need to be equal to one another, and so do the error variances on the diagonal of the err matrix. One could also change the values in the phi matrix to alter the correlations between the latent variables.
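
As an illustration of the parallel-form case, the sketch below sets every loading to .7 and every error variance to .51. Both values, and the helper name build_sigma(), are hypothetical choices made for this example. It also raises the factor correlations from .3 to .5 to show the effect of changing phi.

```r
# Hypothetical helper: population covariance from the CFA model equation
build_sigma <- function(fx, phi, err) fx %*% phi %*% t(fx) + err

n_factors <- 5
n_items   <- 4   # indicators per factor

# Parallel forms: equal loadings and equal error variances (illustrative values)
fx_par  <- kronecker(diag(n_factors), matrix(rep(.7, n_items), ncol = 1))
err_par <- diag(rep(.51, n_factors * n_items))

phi_lo <- matrix(.3, n_factors, n_factors); diag(phi_lo) <- 1
phi_hi <- matrix(.5, n_factors, n_factors); diag(phi_hi) <- 1

sigma_lo <- build_sigma(fx_par, phi_lo, err_par)
sigma_hi <- build_sigma(fx_par, phi_hi, err_par)

unique(round(diag(sigma_lo), 10))   # one common item variance
sigma_lo[1, 2]                      # same-factor covariance: .7 * .7
sigma_lo[1, 5]                      # cross-factor covariance with phi = .3
sigma_hi[1, 5]                      # cross-factor covariance with phi = .5
```

Only the cross-factor blocks of the covariance matrix change between sigma_lo and sigma_hi; the within-factor blocks depend on the loadings and error variances alone.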
