# Example 8.24: MplusAutomation and Mplus

February 7, 2011

(This article was first published on SAS and R, and kindly contributed to R-bloggers)

In recent entries, we've been fitting a series of latent class models using SAS and R. One of the most commonly used and powerful software packages for latent class model estimation is Mplus. This commercial software includes support for many features that are not presently available in R or SAS. As an example, while the randomLCA package supports data with clustering, and the poLCA package supports polytomous variables, neither package supports clustering and polytomous variables together.

In this entry, we demonstrate how to use the R package MplusAutomation to automate the process of fitting and interpreting a series of models using Mplus.

The key to all this magic is the template file which is used to create the Mplus input files. Here we demonstrate automating the creation of 4 models with 1, 2, 3, and 4 latent classes, using a template file called mplus.txt.

    [[init]]
    iterators = classes;
    classes = 1:4;
    dir = "Z:/field/blog";
    filename = "mplus-[[classes]]-class-.inp";
    outputDirectory = [[dir]];
    [[/init]]
    TITLE: [[classes]]-class
    DATA: FILE IS mplus.dat;
    VARIABLE: NAMES ARE homeless cesdcut satreat linkstatus;
    CLASSES = c ([[classes]]);
    CATEGORICAL = all;
    ANALYSIS: TYPE = MIXTURE;  STARTS = 2000 200; STITERATIONS=1000;
    OUTPUT: TECH1 TECH10;
    SAVEDATA: FILE IS "mplus-[[classes]]-class.cprob";
    SAVE IS CPROB;

The package's createModels() function will loop through the four possible numbers of classes (1 through 4) and create separate Mplus input files. Multiple iterators are supported, and they can be referenced numerically or symbolically. This can be very helpful if there are different variables being used in each of the models, or other variations in the model.
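To make the substitution step concrete, here is a minimal base-R sketch of what expanding a `[[classes]]` tag for each iterator value amounts to. It is an imitation for illustration only, not MplusAutomation's actual implementation, and the helper name `expand_template()` is hypothetical.

```r
# Illustrative only: a base-R imitation of the tag substitution,
# not the code that createModels() actually runs.
template <- c(
  "TITLE: [[classes]]-class",
  "CLASSES = c ([[classes]]);",
  "SAVEDATA: FILE IS \"mplus-[[classes]]-class.cprob\";"
)

# expand_template() is a hypothetical helper name.
expand_template <- function(lines, value, tag = "classes") {
  # fixed = TRUE treats "[[classes]]" as a literal string, not a regex
  gsub(paste0("[[", tag, "]]"), as.character(value), lines, fixed = TRUE)
}

# One expanded body per iterator value, named after the file it would become.
inputs <- lapply(1:4, function(k) expand_template(template, k))
names(inputs) <- sprintf("mplus-%d-class-.inp", 1:4)

cat(inputs[["mplus-3-class-.inp"]], sep = "\n")
```

The real function also handles the `[[init]]` section, multiple iterators, and writing the files to `outputDirectory`, none of which this sketch attempts.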

When the createModels() function is run for this example, it generates 4 files. The file mplus-1-class-.inp looks like:

    TITLE: 1-class
    DATA: FILE IS mplus.dat;
    VARIABLE: NAMES ARE homeless cesdcut satreat linkstatus;
    CLASSES = c (1);
    CATEGORICAL = all;
    ANALYSIS: TYPE = MIXTURE;  STARTS = 2000 200; STITERATIONS=1000;
    OUTPUT: TECH1 TECH10;
    SAVEDATA: FILE IS "mplus-1-class.cprob";
    SAVE IS CPROB;

We call Mplus using the runModels() function after reading in the data and writing out a dataset in Mplus format (with prepareMplusData). Then the results can be collated and displayed.

    ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
    attach(ds)
    library(MplusAutomation)
    cesdcut = ifelse(cesd>20, 1, 0)
    smallds = na.omit(data.frame(homeless, cesdcut,
       satreat, linkstatus))
    prepareMplusData(smallds, file="mplus.dat")
    createModels("mplus.txt")
    runModels()
    summary=extractModelSummaries()
    models=readModels()
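For readers without Mplus at hand, the data preparation step can be approximated in base R. The sketch below uses a small made-up data frame (not the HELP study data) and assumes that a file "in Mplus format" means a tab-delimited file with no header row and no row names; that layout is an assumption about what prepareMplusData() writes, not something shown in this entry.

```r
# Toy data, made up for illustration (not the HELP study data).
toy <- data.frame(
  homeless   = c(1, 0, 1, 0),
  cesd       = c(35, 12, 21, 20),
  satreat    = c(0, 1, NA, 0),
  linkstatus = c(1, 0, 1, 0)
)

# Dichotomize CESD at 20, as in the entry: 1 if cesd > 20, else 0.
toy$cesdcut <- ifelse(toy$cesd > 20, 1, 0)

# Keep the four analysis variables and drop rows with any missing value.
small <- na.omit(toy[, c("homeless", "cesdcut", "satreat", "linkstatus")])

# Assumed layout: tab-delimited, no header, no row names.
datfile <- tempfile(fileext = ".dat")
write.table(small, datfile, sep = "\t", row.names = FALSE,
            col.names = FALSE, quote = FALSE)

readLines(datfile)
```

Because the file carries no variable names, the order of the columns has to match the `NAMES ARE` statement in the template.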

We see that the three class solution has the lowest AICC, while the one class solution has the lowest aBIC.

    > summary
        Title                                   AnalysisType
    1 1-class MIXTURE;  STARTS = 2000 200; STITERATIONS=1000
    2 2-class MIXTURE;  STARTS = 2000 200; STITERATIONS=1000
    3 3-class MIXTURE;  STARTS = 2000 200; STITERATIONS=1000
    4 4-class MIXTURE;  STARTS = 2000 200; STITERATIONS=1000
        DataType Estimator Observations Parameters        LL
    1 INDIVIDUAL       MLR          431          4 -1045.656
    2 INDIVIDUAL       MLR          431          9 -1040.513
    3 INDIVIDUAL       MLR          431         14 -1032.484
    4 INDIVIDUAL       MLR          431         19 -1032.067
      LLCorrectionFactor      AIC      BIC     aBIC Entropy
    1              1.000 2099.313 2115.577 2102.883      NA
    2              1.019 2099.026 2135.621 2107.060   0.349
    3              1.000 2092.967 2149.893 2105.465   0.941
    4              1.000 2102.134 2179.390 2119.095   0.832
          AICC           Filename
    1 2099.407 mplus-1-class-.out
    2 2099.454 mplus-2-class-.out
    3 2093.977 mplus-3-class-.out
    4 2103.983 mplus-4-class-.out
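The fit indices in this table can be recomputed from the LL, Parameters, and Observations columns, which is a handy check when comparing models. The sketch below uses the standard definitions (AIC, BIC, the sample-size-adjusted BIC with n replaced by (n + 2) / 24, and the small-sample corrected AIC); that these are exactly the formulas Mplus applies is an assumption, though the results agree with the table to within rounding of the log-likelihood.

```r
# Values copied from the summary table above.
ll <- c(-1045.656, -1040.513, -1032.484, -1032.067)  # LL
k  <- c(4, 9, 14, 19)                                # Parameters
n  <- 431                                            # Observations

aic  <- -2 * ll + 2 * k
bic  <- -2 * ll + k * log(n)
abic <- -2 * ll + k * log((n + 2) / 24)      # sample-size-adjusted BIC
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)  # small-sample corrected AIC

fit <- data.frame(classes = 1:4, AIC = aic, BIC = bic,
                  aBIC = abic, AICC = aicc)
print(round(fit, 3))

# Number of classes preferred by each criterion (smaller is better).
fit$classes[which.min(fit$AICC)]
fit$classes[which.min(fit$aBIC)]
```

The criteria differ only in how heavily they penalize the number of parameters, which is why AICC and aBIC can point to different solutions for the same set of log-likelihoods.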

Additional results for each of the specific models can be found in the returned objects.

    > names(models)
    [1] "mplus.1.class..out" "mplus.2.class..out"
    [3] "mplus.3.class..out" "mplus.4.class..out"
    > names(models$mplus.1.class..out)
    [1] "parameters" "savedata"   "summaries"
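Since the returned object is a named list with one element per output file, the usual list tools apply. The sketch below builds a mock stand-in so that it runs without Mplus: only the element names (`parameters`, `savedata`, `summaries`) and the AICC values come from the output above, while the layout of `summaries` as a one-row data frame is an assumption made for illustration.

```r
# Mock stand-in for the object returned by readModels(); the contents of
# each component are assumed, only the names mirror the output above.
aicc_values <- c(2099.407, 2099.454, 2093.977, 2103.983)
mock_models <- lapply(1:4, function(i) {
  list(
    parameters = list(),   # placeholder, left empty here
    savedata   = list(),   # placeholder, left empty here
    summaries  = data.frame(Title = paste0(i, "-class"),
                            AICC = aicc_values[i])
  )
})
names(mock_models) <- sprintf("mplus.%d.class..out", 1:4)

# Pull one statistic out of every model and find the best-fitting one.
aicc <- sapply(mock_models, function(m) m$summaries$AICC)
aicc
names(which.min(aicc))
```

With the real object, the same `sapply()` pattern should work for any field that the `summaries` component actually contains.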

In a future entry, we'll explore more ways to utilize the information in the Mplus output, including displaying the prevalences in each group in a graphical manner.
