Example 8.24: MplusAutomation and Mplus

[This article was first published on SAS and R, and kindly contributed to R-bloggers.]

In recent entries (here, here, and here), we’ve been fitting a series of latent class models using SAS and R. One of the most commonly used and powerful software packages for latent class model estimation is Mplus. This commercial software supports many features that are not presently available in R or SAS. For example, the randomLCA package supports clustered data and the poLCA package supports polytomous variables, but neither package supports both clustering and polytomous variables in the same model.

In this entry, we demonstrate how to use the R package MplusAutomation to automate the process of fitting and interpreting a series of models using Mplus.

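The key to all this magic is the template file, which is used to create the Mplus input files. Here we demonstrate automating the creation of four models with 1, 2, 3, and 4 latent classes, using a template file called mplus.txt. The section between [[init]] and [[/init]] defines the iterator and the output file names; everything after it is ordinary Mplus syntax in which each [[classes]] tag is replaced by the current number of classes: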
[[init]]
iterators = classes;
classes = 1:4;
dir = "Z:/field/blog";
filename = "mplus-[[classes]]-class-.inp";
outputDirectory = [[dir]];
[[/init]]
TITLE: [[classes]]-class
DATA: FILE IS mplus.dat;
VARIABLE: NAMES ARE homeless cesdcut satreat linkstatus;
CLASSES = c ([[classes]]);
CATEGORICAL = all;
ANALYSIS: TYPE = MIXTURE;  
STARTS = 2000 200; 
STITERATIONS=1000;
OUTPUT: TECH1 TECH10;
SAVEDATA: FILE IS "mplus-[[classes]]-class.cprob";
SAVE IS CPROB;

The package’s createModels() function will loop through the four possible numbers of classes (1 through 4) and create separate Mplus input files. Multiple iterators are supported, and they can be referenced numerically or symbolically. This can be very helpful if there are different variables being used in each of the models, or other variations in the model.
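To make the substitution step concrete, here is a minimal base R sketch of what happens to the [[classes]] tag for each iterator value. It is only an illustration of the idea, not how MplusAutomation is implemented: the helper fill_tag() is hypothetical, and only two lines of the template are reproduced.

```r
# Illustrative stand-in for the tag substitution; NOT MplusAutomation's code.
# Two lines of the template body and the filename pattern from mplus.txt.
template <- c(
  "TITLE: [[classes]]-class",
  "CLASSES = c ([[classes]]);"
)
filename_pattern <- "mplus-[[classes]]-class-.inp"

# fill_tag() is a hypothetical helper: replace every [[classes]] tag
# with one value of the iterator (fixed = TRUE treats the tag literally).
fill_tag <- function(text, value) {
  gsub("[[classes]]", as.character(value), text, fixed = TRUE)
}

classes <- 1:4

# One filename and one filled-in body per iterator value.
filenames <- vapply(classes, function(k) fill_tag(filename_pattern, k),
                    character(1))
bodies <- lapply(classes, function(k) fill_tag(template, k))

print(filenames)
print(bodies[[1]])
```

The real createModels() does considerably more (multiple iterators, output directories, writing the files to disk), so treat this only as a picture of the looping and replacement.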

When the createModels() function is run for this example, it generates 4 files. The file mplus-1-class-.inp looks like:
TITLE: 1-class
DATA: FILE IS mplus.dat;
VARIABLE: NAMES ARE homeless cesdcut satreat linkstatus;
CLASSES = c (1);
CATEGORICAL = all;
ANALYSIS: TYPE = MIXTURE;  
STARTS = 2000 200; 
STITERATIONS=1000;
OUTPUT: TECH1 TECH10;
SAVEDATA: FILE IS "mplus-1-class.cprob";
SAVE IS CPROB;

We call Mplus using the runModels() function after reading in the data and writing out a dataset in Mplus format (with prepareMplusData). Then the results can be collated and displayed.
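library(MplusAutomation)
# read in the data
ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
# dichotomize cesd at 20
ds$cesdcut = ifelse(ds$cesd > 20, 1, 0)
# keep the four indicators and drop rows with missing values
smallds = na.omit(ds[, c("homeless", "cesdcut", "satreat", "linkstatus")])
# write the data out in Mplus format
prepareMplusData(smallds, file="mplus.dat")
# create the four input files from the template, then run them in Mplus
createModels("mplus.txt")
runModels()
# collate the fit statistics and read in the full results
summary = extractModelSummaries()
models = readModels()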

We see that the three-class solution has the lowest AIC and AICC, while the one-class solution has the lowest BIC and aBIC.
> summary
    Title                                   AnalysisType
1 1-class MIXTURE;  STARTS = 2000 200; STITERATIONS=1000
2 2-class MIXTURE;  STARTS = 2000 200; STITERATIONS=1000
3 3-class MIXTURE;  STARTS = 2000 200; STITERATIONS=1000
4 4-class MIXTURE;  STARTS = 2000 200; STITERATIONS=1000
    DataType Estimator Observations Parameters        LL
1 INDIVIDUAL       MLR          431          4 -1045.656
2 INDIVIDUAL       MLR          431          9 -1040.513
3 INDIVIDUAL       MLR          431         14 -1032.484
4 INDIVIDUAL       MLR          431         19 -1032.067
  LLCorrectionFactor      AIC      BIC     aBIC Entropy
1              1.000 2099.313 2115.577 2102.883      NA
2              1.019 2099.026 2135.621 2107.060   0.349
3              1.000 2092.967 2149.893 2105.465   0.941
4              1.000 2102.134 2179.390 2119.095   0.832
      AICC           Filename
1 2099.407 mplus-1-class-.out
2 2099.454 mplus-2-class-.out
3 2093.977 mplus-3-class-.out
4 2103.983 mplus-4-class-.out
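
As a cross-check, the information criteria in the table above can be recomputed from the log-likelihood, the number of parameters, and the number of observations. The sketch below assumes the usual definitions (AIC = -2LL + 2k, BIC = -2LL + k log n, sample-size adjusted BIC with n replaced by (n + 2)/24, and AICC = AIC + 2k(k + 1)/(n - k - 1)); the data frame fit is typed in by hand from the printed output rather than taken from the MplusAutomation objects.

```r
# Values copied by hand from the summary table (LL, Parameters, Observations).
fit <- data.frame(
  classes    = 1:4,
  LL         = c(-1045.656, -1040.513, -1032.484, -1032.067),
  Parameters = c(4, 9, 14, 19)
)
n <- 431  # Observations

# Standard definitions, assumed here rather than taken from the post.
fit$AIC  <- -2 * fit$LL + 2 * fit$Parameters
fit$BIC  <- -2 * fit$LL + fit$Parameters * log(n)
fit$aBIC <- -2 * fit$LL + fit$Parameters * log((n + 2) / 24)
fit$AICC <- fit$AIC +
  2 * fit$Parameters * (fit$Parameters + 1) / (n - fit$Parameters - 1)

# Number of classes preferred by each criterion (smaller is better).
best <- sapply(fit[c("AIC", "BIC", "aBIC", "AICC")],
               function(x) fit$classes[which.min(x)])

print(round(fit, 3))
print(best)
```

Because the log-likelihoods are rounded to three decimals, the recomputed values can differ from the Mplus output in the last digit.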

Additional results for each of the specific models can be found in the returned objects.
> names(models)
[1] "mplus.1.class..out" "mplus.2.class..out"
[3] "mplus.3.class..out" "mplus.4.class..out"
> names(models$mplus.1.class..out)
[1] "parameters" "savedata"   "summaries" 
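
The element names differ from the output file names (dots instead of hyphens). This is consistent with the file names having been passed through base R's make.names(), which replaces characters that are not valid in R names with a dot. That is an inference from the output shown here, not something the post states, but the mapping can be checked directly:

```r
# Output file names as listed in the Filename column of the summary.
out_files <- c("mplus-1-class-.out", "mplus-2-class-.out",
               "mplus-3-class-.out", "mplus-4-class-.out")

# make.names() turns each "-" into ".", giving syntactically valid names.
element_names <- make.names(out_files)
print(element_names)
```

If the names match as expected, a single model's results can be selected by name, for example with models[[element_names[3]]] for the three-class solution.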

In a future entry, we’ll explore more ways to utilize the information in the Mplus output, including displaying the prevalences in each group in a graphical manner.
