# All together now – Confirmatory Factor Analysis in R

December 8, 2010

(This article was first published on Sustainable Research » Renglish, and kindly contributed to R-bloggers)

Describing multivariate data is not easy, especially if you assume that statisticians have developed no new tools since ANOVA and principal component analysis (PCA). For social and experimental scientists, the most important newer technique is structural equation modelling (SEM), which combines measurement models (which take the place of reliability analysis and PCA) with structural models (which take the place of ANOVAs or regressions).

At present, three R packages provide the functionality to estimate structural equation models.

• sem: The first package to provide the ability to fit structural equation models in R.
• OpenMx: Has a large number of active developers, draws upon well-established code for fitting the models (Mx), can fit non-standard models, and is the first to announce version 1.0.
• lavaan: Aims at a very easy-to-use implementation of SEM that also incorporates advanced techniques (e.g. Full Information Maximum Likelihood Estimation, and multiple-group confirmatory factor analysis).

Today we focus on using structural equation models to fit a measurement model that specifies which items load on which factor. This is similar to what some do with principal component analysis or exploratory factor analysis. If you already know how the items form the factors, you should use confirmatory factor analysis (CFA), because it gives you several measures of fit. Another advantage is that the SEM framework lets you ask questions about differences between groups at various levels.
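
In the standard notation for confirmatory factor analysis (the symbols are not from the original post and are used here only for illustration), such a measurement model can be sketched as

$$
\mathbf{y} = \Lambda \boldsymbol{\eta} + \boldsymbol{\varepsilon},
\qquad
\Sigma = \Lambda \Psi \Lambda^{\top} + \Theta
$$

Here $\mathbf{y}$ holds the observed items (assumed centred, so intercepts drop out), $\boldsymbol{\eta}$ the latent factors, $\Lambda$ the factor loadings, $\Psi$ the covariance matrix of the factors, $\Theta$ the covariance matrix of the residuals $\boldsymbol{\varepsilon}$ (assumed uncorrelated with the factors), and $\Sigma$ the model-implied covariance matrix of the items. Saying which items load on which factor amounts to fixing all other entries of $\Lambda$ to zero.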

Using lavaan, a simple model with two latent variables, each measured with four items, can be fitted with the following lines of code.

```r
library(lavaan)

model <- '
  # latent variable definitions
  factor_1 =~ y1 + y2 + y3 + y4
  factor_2 =~ y5 + y6 + y7 + y8
  # covariance between factor_1 and factor_2
  factor_1 ~~ factor_2
  # residual covariances
  y1 ~~ y5
'

fit <- cfa(model, data = ex_data)
summary(fit)
```
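
The data set `ex_data` is not included with this post. To try the model anyway, one option is to simulate a stand-in data frame. The following is only a sketch: the sample size, the loadings of 0.7, the factor correlation of 0.3 and the helper `make_item()` are all made-up values and names chosen for illustration, not properties of the original data.

```r
library(lavaan)

set.seed(123)  # arbitrary seed, only for reproducibility
n <- 500       # hypothetical sample size

# Two standardised latent factors, correlated at about 0.3 (illustrative value)
f1 <- rnorm(n)
f2 <- 0.3 * f1 + sqrt(1 - 0.3^2) * rnorm(n)

# Hypothetical helper: item = 0.7 * factor + noise, giving unit item variance
make_item <- function(f) 0.7 * f + rnorm(length(f), sd = sqrt(1 - 0.7^2))

# Stand-in for the post's ex_data: y1-y4 measure f1, y5-y8 measure f2
ex_data <- data.frame(
  y1 = make_item(f1), y2 = make_item(f1),
  y3 = make_item(f1), y4 = make_item(f1),
  y5 = make_item(f2), y6 = make_item(f2),
  y7 = make_item(f2), y8 = make_item(f2)
)

# Same model as in the post
model <- '
  factor_1 =~ y1 + y2 + y3 + y4
  factor_2 =~ y5 + y6 + y7 + y8
  factor_1 ~~ factor_2
  y1 ~~ y5
'

fit <- cfa(model, data = ex_data)

# Pull out a few selected fit indices instead of reading the full summary
fitMeasures(fit, c("chisq", "df", "pvalue", "cfi", "rmsea"))
```

Because the simulated items contain no residual covariance between `y1` and `y5`, that parameter should come out close to zero here; with real data it would be estimated from whatever the items actually share.
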
The output you get contains all the fit indices you love (RMSEA, GFI, CFI…). As a bonus, lavaan has a dedicated function that runs a multiple-group confirmatory factor analysis to test for measurement invariance, something that took me a while in AMOS.
```r
measurement.invariance(model, data = ex_data, group = "school")
```
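
If that helper is not available in the lavaan version you have installed, the same sequence of nested models can be fitted step by step. The sketch below makes two assumptions that are not part of the original example: it uses the `HolzingerSwineford1939` data set that ships with lavaan (which happens to contain a `school` column) instead of `ex_data`, and it uses the three-factor model from the lavaan documentation instead of the two-factor model above.

```r
library(lavaan)

# Three-factor model for the built-in HolzingerSwineford1939 data
# (an assumption for illustration; not the model from this post)
hs_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# Configural invariance: same structure, all parameters free in each school
fit_configural <- cfa(hs_model, data = HolzingerSwineford1939,
                      group = "school")

# Metric (weak) invariance: factor loadings constrained equal across schools
fit_metric <- cfa(hs_model, data = HolzingerSwineford1939,
                  group = "school", group.equal = "loadings")

# Scalar (strong) invariance: loadings and intercepts constrained equal
fit_scalar <- cfa(hs_model, data = HolzingerSwineford1939,
                  group = "school",
                  group.equal = c("loadings", "intercepts"))

# Chi-square difference tests between the successively stricter models
lavTestLRT(fit_configural, fit_metric, fit_scalar)
```

A significant chi-square difference between two adjacent models suggests that the added equality constraints do not hold across groups.
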
Cons:
• lavaan is currently at version 0.3, so one should check its results against other programmes.