From Lavaan to OpenMx

Posted on October 2, 2012 by Timothy Bates in R bloggers | 0 Comments

[This article was first published on Industrial Code Workshop, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Joel Caldwell gave an example of Confirmatory factor analysis in lavaan.

The purpose of this post is to show you how to express this model in OpenMx.

The data being modeled are in an object called ratings (see below for the code to create this object as described here.

Here’s Joe’s example structural equation model, as implemented in lavaan:

 require(lavaan)
 bifactor <- "
 general.factor =~ Easy_Reservation + Preferred_Seats
  + Flight_Options + Ticket_Prices
 + Seat_Comfort + Seat_Roominess
 + Overhead_Storage + Clean_Aircraft
 + Courtesy + Friendliness + Helpfulness + Service
 
 ticketing =~ Easy_Reservation + Preferred_Seats + Flight_Options + Ticket_Prices
 aircraft  =~ Seat_Comfort + Seat_Roominess + Overhead_Storage + Clean_Aircraft
 service   =~ Courtesy + Friendliness + Helpfulness + Service"
 
 modelfit <- cfa(bifactor, data=ratings[,1:12], orthogonal=T)
 summary(modelfit, fit.measures = TRUE, standardize = T)

 bifactor <- "
 general =~ Easy_Reservation + Preferred_Seats 
  + Flight_Options + Ticket_Prices
 + Seat_Comfort + Seat_Roominess + Overhead_Storage 
 + Clean_Aircraft + Courtesy + Friendliness
 + Helpfulness + Service
 ticketing =~ Easy_Reservation + Preferred_Seats + Flight_Options + Ticket_Prices
 aircraft  =~ Seat_Comfort + Seat_Roominess + Overhead_Storage + Clean_Aircraft
 service   =~ Courtesy + Friendliness + Helpfulness + Service
 Satisfaction ~ general + ticketing + aircraft + service
 "
 satfit <- sem(bifactor, data=ratings[,c(1:12,13)], orthogonal=T)
 summary(satfit, fit.measures = T, standardize = T)
 inspect(satfit, "rsquare")

And now the same model in OpenMx

 library(OpenMx)
 # make up some helpful lists of items we will need
 serviceVars     = c("Courtesy", "Friendliness", "Helpfulness", "Service")
 aircraftVars    = c("Seat_Comfort", "Seat_Roominess", "Overhead_Storage", "Clean_Aircraft")
 ticketingVars   = c("Easy_Reservation", "Preferred_Seats", "Flight_Options", "Ticket_Prices")
 allMeasuredVars = c(serviceVars, aircraftVars, ticketingVars)
 allLatentVars   = c("general_factor", "ticketing", "aircraft","service")

 bifactor <- mxModel("bifactor", type="RAM",
 manifestVars = allMeasuredVars,
 latentVars   = allLatentVars,
 mxPath(from = allLatentVars  , arrows = 2, free = F, values = 1), # latents fixed@1
 mxPath(from = allMeasuredVars, arrows = 2), # manifest residuals
 # Factor loadings
 mxPath(from = "general_factor", to = allMeasuredVars),
 mxPath(from = "ticketing"     , to = ticketingVars),
 mxPath(from = "aircraft"      , to = aircraftVars),
 mxPath(from = "service"       , to = serviceVars),
 mxData(cov(ratings[,allMeasuredVars]), type = "cov", numObs = nrow(ratings))
 )
 bifactor= mxRun(bifactor)
 tmp = summary(bifactor)

If you print the summary (just type tmp now as we stored it there), you will see a fullsome summary, with standardized and unstandardized paths, std errors, and a list of fit indices.

Here are the fit indices formatted nicely: χ2(42) = 48.57, p = 0.2253; CFI = 0.999; TLI = 0.999; RMSEA = 0.013

Because this is R, we can also format these nicely. Let’s pull out the standardized paths, round them off

pathLoadings = tmp$parameters[,c("row", "col", "Std.Estimate")]

pathLoadings$Std.Estimate = round(pathLoadings$Std.Estimate,2)

pathLoadings


                row              col Std.Estimate
1          Courtesy   general_factor         0.78
2      Friendliness   general_factor         0.76
3       Helpfulness   general_factor         0.84
4           Service   general_factor         0.81
5      Seat_Comfort   general_factor         0.79
6    Seat_Roominess   general_factor         0.72
7  Overhead_Storage   general_factor         0.70
8    Clean_Aircraft   general_factor         0.74
9  Easy_Reservation   general_factor         0.75
10  Preferred_Seats   general_factor         0.70
11   Flight_Options   general_factor         0.62
12    Ticket_Prices   general_factor         0.64
13 Easy_Reservation        ticketing         0.31
14  Preferred_Seats        ticketing         0.40
15   Flight_Options        ticketing         0.39
16    Ticket_Prices        ticketing         0.44
17     Seat_Comfort         aircraft         0.30
18   Seat_Roominess         aircraft         0.37
19 Overhead_Storage         aircraft         0.44
20   Clean_Aircraft         aircraft         0.34
21         Courtesy          service         0.27
22     Friendliness          service         0.44
23      Helpfulness          service         0.27
24          Service          service         0.34
25         Courtesy         Courtesy         0.33
26     Friendliness     Friendliness         0.23
27      Helpfulness      Helpfulness         0.23
28          Service          Service         0.23
29     Seat_Comfort     Seat_Comfort         0.28
30   Seat_Roominess   Seat_Roominess         0.34
31 Overhead_Storage Overhead_Storage         0.32
32   Clean_Aircraft   Clean_Aircraft         0.34
33 Easy_Reservation Easy_Reservation         0.34
34  Preferred_Seats  Preferred_Seats         0.36
35   Flight_Options   Flight_Options         0.47
36    Ticket_Prices    Ticket_Prices         0.39
>

In my next post, I’ll show you how to standardize a model and then format the output more like you might expect for an EFA.

’til then, happy OpenMx-ing! tim

PS: If you want to learn more about fit indices, I recommend this Kenny link.

You probably want a figure as well, no? I describe doing that here.

R Code to Generate the Simulated Data and Run All Analyses

To generate the ratings data, execute this code below:

   
# First, we create a matrix of factor loadings.
# This pattern is called bifactor because it has a
# general factor with loadings from all the items
# and specific factors for separate components.
# The outcome variables are also formed as
# combinations of these general and specific factors.

loadings <- matrix(c(
 .33, .58, .00, .00,  # Ease of Making Reservation
 .35, .55, .00, .00,  # Availability of Preferred Seats
 .30, .52, .00, .00,  # Variety of Flight Options
 .40, .50, .00, .00,  # Ticket Prices
 .50, .00, .55, .00,  # Seat Comfort
 .41, .00, .51, .00,  # Roominess of Seat Area
 .45, .00, .57, .00,  # Availability of Overhead Storage
 .32, .00, .54, .00,  # Cleanliness of Aircraft
 .35, .00, .00, .50,  # Courtesy
 .38, .00, .00, .57,  # Friendliness
 .60, .00, .00, .50,  # Helpfulness
 .52, .00, .00, .58,  # Service
 .43, .10, .20, .30,  # Overall Satisfaction
 .35, .50, .40, .20,  # Purchase Intention
 .25, .50, .50, .00), # Willingness to Recommend
 nrow = 15, ncol = 4, byrow = TRUE
)

# Matrix multiplication produces the correlation matrix,
# except for the diagonal.
cor_matrix <- loadings %*% t(loadings)
# Diagonal set to ones.
diag(cor_matrix) <- 1
# cor_matrix = cov2cor(cor_matrix)
require(mvtnorm)
N = 1000
set.seed(7654321) #needed in order to reproduce the same data each time
std_ratings <- as.data.frame(rmvnorm(N, sigma = cor_matrix))

# Creates a mixture of two data sets:
# first 50 observations assigned uniformly lower scores.
ratings           <- data.frame(matrix(rep(0,15000),nrow=1000))
ratings[1:50,]    <- std_ratings[1:50,] * 2
ratings[51:1000,] <- std_ratings[51:1000,] * 2+7.0

# Ratings given different means
ratings[1] <- ratings[1]+2.2
ratings[2] <- ratings[2]+0.6
ratings[3] <- ratings[3]+0.3
ratings[4] <- ratings[4]+0.0
ratings[5] <- ratings[5]+1.5
ratings[6] <- ratings[6]+1.0
ratings[7] <- ratings[7]+0.5
ratings[8] <- ratings[8]+1.5
ratings[9] <- ratings[9]+2.4
ratings[10]<- ratings[10]+2.2
ratings[11]<- ratings[11]+2.1
ratings[12]<- ratings[12]+2.0
ratings[13]<- ratings[13]+1.5
ratings[14]<- ratings[14]+1.0
ratings[15]<- ratings[15]+0.5

# Truncates Scale to be between 1 and 9
ratings[ratings>9]<-9
ratings[ratings<1]<-1
# Rounds to single digit.
ratings<-round(ratings,0)

# Assigns names to the variables in the data frame called ratings
names(ratings) = c("Easy_Reservation","Preferred_Seats","Flight_Options","Ticket_Prices","Seat_Comfort","Seat_Roominess","Overhead_Storage","Clean_Aircraft","Courtesy","Friendliness","Helpfulness","Service","Satisfaction","Fly_Again","Recommend")

Now you have an object called ratings to work with.

To leave a comment for the author, please follow the link and comment on their blog: Industrial Code Workshop.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

From Lavaan to OpenMx

R Code to Generate the Simulated Data and Run All Analyses

Related

R Code to Generate the Simulated Data and Run All Analyses

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)