L is for Latent Variable Path Analysis

April 13, 2018

[This article was first published on Deeply Trivial, and kindly contributed to R-bloggers].

For the letter F, I introduced the lavaan package with confirmatory factor analysis. You may have noticed during my video on interpreting output that there are two functions for analysis: cfa and sem. When the model you specify is a confirmatory factor analysis, it doesn’t really matter which of these you use, because the results will be a CFA either way. But there are other models you can specify, which is where the sem function becomes useful.
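To see that the two functions agree on a pure CFA, here is a minimal sketch that fits the same measurement model with both. It uses lavaan's built-in HolzingerSwineford1939 data rather than the Facebook dataset, and the object names (hs_model, fit_cfa, fit_sem) are my own choices for illustration.

```r
library(lavaan)

# Three-factor CFA on lavaan's built-in HolzingerSwineford1939 data
# (an assumption for this sketch; not the Facebook dataset from this post)
hs_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# Fit the identical model syntax with both wrapper functions
fit_cfa <- cfa(hs_model, data = HolzingerSwineford1939)
fit_sem <- sem(hs_model, data = HolzingerSwineford1939)

# Pull a few fit measures from each fit and line them up
measures  <- c("chisq", "df", "cfi", "rmsea")
stats_cfa <- as.numeric(fitMeasures(fit_cfa, measures))
stats_sem <- as.numeric(fitMeasures(fit_sem, measures))
names(stats_cfa) <- names(stats_sem) <- measures
print(round(rbind(cfa = stats_cfa, sem = stats_sem), 3))

# TRUE when every free parameter estimate matches across the two fits
same_estimates <- isTRUE(all.equal(coef(fit_cfa), coef(fit_sem)))
print(same_estimates)
```

The two rows of the printed table should be identical, which is why the choice only starts to matter once the model contains something beyond the measurement part.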

One of those models is latent variable path analysis, or LVPA for short. This analysis technique combines path analysis, where you specify causal relationships between variables, and confirmatory factor analysis, where combinations of observed variables are used to measure a latent variable or factor. So LVPA allows you to specify which observed variables measure which factors, as well as causal relationships between those factors.

To demonstrate, I’ll revisit an analysis from my B is for Beta post; in that post, I used linear regression to demonstrate the predictive relationship between rumination and depression. I simply used total scores from my rumination and depression measures, but I could have conducted an LVPA instead, allowing each of the items from these measures to load onto its own factor. In fact, this analysis technique is very similar to regression, and is sometimes called “structural regression”.

Once again, we’ll load our Facebook dataset – as a reminder, you can access a simulated version of this dataset (along with a simple codebook). Then we’ll load the lavaan package.

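# Read the tab-delimited Facebook dataset (the file must be in the working directory)
Facebook <- read.delim(file = "small_facebook_set.txt", header = TRUE)
# Load lavaan for latent variable modeling
library(lavaan)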
## This is lavaan 0.5-23.1097
## lavaan is BETA software! Please report any bugs.

Next, we’ll create our models. For the factors, we’ll use the syntax we used previously, where we specify the name of the factor, =~, then the names of the variables loading onto that factor. To specify causal relationships between factors, the factor being caused goes first (endogenous latent variable), followed by the ~ symbol, then the causal factor or factors (exogenous latent variable(s)).
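As an aside, with 16 depression items and 22 rumination items, typing every indicator by hand is easy to get wrong. Here is a minimal sketch of generating the same syntax with paste0(); the helper name measure_line is hypothetical, and it assumes the items are numbered consecutively as Dep1 to Dep16 and Rum1 to Rum22.

```r
# Hypothetical helper: writes "Factor =~ item1 + item2 + ..." for numbered items
measure_line <- function(factor_name, prefix, items) {
  paste(factor_name, "=~", paste0(prefix, items, collapse = " + "))
}

# Two measurement lines plus one structural path (outcome ~ predictor)
rum_dep_syntax <- paste(
  measure_line("Depression", "Dep", 1:16),
  measure_line("Rumination", "Rum", 1:22),
  "Depression ~ Rumination",
  sep = "\n"
)

# Inspect the generated model syntax before handing it to sem()
cat(rum_dep_syntax, "\n")
```

The resulting string can be passed to sem() in place of a hand-typed model. The hand-typed version below does the same thing and is easier to read in a tutorial.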

Rum_Dep<-'
Depression =~ Dep1 + Dep2 + Dep3 + Dep4 + Dep5 + Dep6 + Dep7 + Dep8 +
Dep9 + Dep10 + Dep11 + Dep12 + Dep13 + Dep14 + Dep15 + Dep16
Rumination =~ Rum1 + Rum2 + Rum3 + Rum4 + Rum5 + Rum6 + Rum7 + Rum8 + Rum9 +
Rum10 + Rum11 + Rum12 + Rum13 + Rum14 + Rum15 + Rum16 +
Rum17 + Rum18 + Rum19 + Rum20 + Rum21 + Rum22
Depression ~ Rumination
'

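# Fit the structural regression model to the Facebook data
RD_Fit <- sem(Rum_Dep, data = Facebook)
# Print estimates with standardized solutions and fit indices
summary(RD_Fit, standardized = TRUE, fit.measures = TRUE)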
## lavaan (0.5-23.1097) converged normally after  41 iterations
##
## Number of observations 257
##
## Estimator ML
## Minimum Function Test Statistic 1689.041
## Degrees of freedom 664
## P-value (Chi-square) 0.000
##
## Model test baseline model:
##
## Minimum Function Test Statistic 5182.076
## Degrees of freedom 703
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.771
## Tucker-Lewis Index (TLI) 0.758
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -11766.888
## Loglikelihood unrestricted model (H1) -10922.367
##
## Number of free parameters 77
## Akaike (AIC) 23687.775
## Bayesian (BIC) 23961.054
## Sample-size adjusted Bayesian (BIC) 23716.941
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.078
## 90 Percent Confidence Interval 0.073 0.082
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.076
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## Depression =~
## Dep1 1.000 0.451 0.583
## Dep2 0.806 0.123 6.565 0.000 0.364 0.469
## Dep3 1.395 0.156 8.944 0.000 0.630 0.706
## Dep4 0.784 0.149 5.243 0.000 0.354 0.361
## Dep5 0.799 0.142 5.619 0.000 0.361 0.390
## Dep6 1.614 0.160 10.084 0.000 0.729 0.856
## Dep7 0.901 0.145 6.204 0.000 0.407 0.438
## Dep8 1.207 0.140 8.601 0.000 0.545 0.667
## Dep9 0.734 0.126 5.817 0.000 0.332 0.406
## Dep10 1.401 0.157 8.902 0.000 0.633 0.701
## Dep11 0.849 0.116 7.292 0.000 0.383 0.534
## Dep12 1.106 0.142 7.818 0.000 0.499 0.585
## Dep13 0.826 0.113 7.324 0.000 0.373 0.537
## Dep14 1.420 0.142 9.976 0.000 0.641 0.840
## Dep15 1.161 0.140 8.270 0.000 0.524 0.631
## Dep16 0.980 0.136 7.194 0.000 0.442 0.525
## Rumination =~
## Rum1 1.000 0.609 0.596
## Rum2 0.834 0.120 6.967 0.000 0.508 0.492
## Rum3 0.787 0.118 6.642 0.000 0.479 0.465
## Rum4 0.899 0.120 7.492 0.000 0.548 0.538
## Rum5 1.071 0.141 7.624 0.000 0.653 0.550
## Rum6 1.100 0.133 8.283 0.000 0.671 0.612
## Rum7 1.158 0.140 8.301 0.000 0.706 0.613
## Rum8 1.133 0.136 8.341 0.000 0.691 0.617
## Rum9 1.043 0.130 8.040 0.000 0.635 0.588
## Rum10 1.145 0.134 8.526 0.000 0.698 0.635
## Rum11 1.055 0.134 7.885 0.000 0.643 0.574
## Rum12 0.564 0.115 4.891 0.000 0.343 0.329
## Rum13 0.788 0.108 7.282 0.000 0.480 0.519
## Rum14 1.137 0.128 8.872 0.000 0.693 0.671
## Rum15 1.282 0.143 8.968 0.000 0.781 0.681
## Rum16 1.363 0.142 9.581 0.000 0.830 0.748
## Rum17 1.271 0.136 9.356 0.000 0.775 0.723
## Rum18 1.256 0.137 9.170 0.000 0.765 0.702
## Rum19 1.168 0.129 9.069 0.000 0.711 0.692
## Rum20 1.338 0.142 9.404 0.000 0.815 0.728
## Rum21 0.978 0.130 7.531 0.000 0.596 0.542
## Rum22 1.248 0.136 9.153 0.000 0.760 0.701
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## Depression ~
## Rumination 0.434 0.068 6.396 0.000 0.586 0.586
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Dep1 0.395 0.036 10.859 0.000 0.395 0.660
## .Dep2 0.469 0.042 11.076 0.000 0.469 0.780
## .Dep3 0.399 0.038 10.413 0.000 0.399 0.501
## .Dep4 0.836 0.075 11.198 0.000 0.836 0.870
## .Dep5 0.724 0.065 11.170 0.000 0.724 0.848
## .Dep6 0.193 0.022 8.777 0.000 0.193 0.267
## .Dep7 0.696 0.063 11.117 0.000 0.696 0.808
## .Dep8 0.370 0.035 10.593 0.000 0.370 0.555
## .Dep9 0.556 0.050 11.154 0.000 0.556 0.835
## .Dep10 0.413 0.040 10.438 0.000 0.413 0.508
## .Dep11 0.368 0.034 10.967 0.000 0.368 0.715
## .Dep12 0.480 0.044 10.856 0.000 0.480 0.658
## .Dep13 0.343 0.031 10.961 0.000 0.343 0.711
## .Dep14 0.171 0.019 9.098 0.000 0.171 0.294
## .Dep15 0.414 0.039 10.723 0.000 0.414 0.601
## .Dep16 0.514 0.047 10.984 0.000 0.514 0.724
## .Rum1 0.675 0.062 10.917 0.000 0.675 0.645
## .Rum2 0.807 0.073 11.092 0.000 0.807 0.758
## .Rum3 0.833 0.075 11.126 0.000 0.833 0.784
## .Rum4 0.737 0.067 11.026 0.000 0.737 0.710
## .Rum5 0.983 0.089 11.006 0.000 0.983 0.698
## .Rum6 0.752 0.069 10.881 0.000 0.752 0.626
## .Rum7 0.826 0.076 10.877 0.000 0.826 0.624
## .Rum8 0.775 0.071 10.867 0.000 0.775 0.619
## .Rum9 0.763 0.070 10.933 0.000 0.763 0.654
## .Rum10 0.718 0.066 10.820 0.000 0.718 0.596
## .Rum11 0.842 0.077 10.962 0.000 0.842 0.671
## .Rum12 0.971 0.086 11.243 0.000 0.971 0.892
## .Rum13 0.624 0.056 11.054 0.000 0.624 0.730
## .Rum14 0.587 0.055 10.713 0.000 0.587 0.550
## .Rum15 0.706 0.066 10.677 0.000 0.706 0.536
## .Rum16 0.542 0.052 10.366 0.000 0.542 0.440
## .Rum17 0.548 0.052 10.502 0.000 0.548 0.478
## .Rum18 0.601 0.057 10.594 0.000 0.601 0.507
## .Rum19 0.552 0.052 10.637 0.000 0.552 0.522
## .Rum20 0.589 0.056 10.475 0.000 0.589 0.470
## .Rum21 0.855 0.078 11.020 0.000 0.855 0.707
## .Rum22 0.600 0.057 10.601 0.000 0.600 0.509
## .Depression 0.134 0.027 4.880 0.000 0.657 0.657
## Rumination 0.371 0.072 5.123 0.000 1.000 1.000

The factor analysis results are interpreted in the same way as before. The only difference is that we also have a path coefficient between rumination and depression, which describes the strength of the relationship between the two latent variables. These results confirm our regression results: rumination has a strong predictive relationship with depression, which we can see from the standardized path coefficient, 0.586. But the model also shows signs of poor fit, based on our CFI and TLI, both less than 0.9, as well as a slightly elevated RMSEA (0.078, greater than our cutoff of 0.07).
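The fit indices quoted here can be reproduced by hand from the chi-square values in the output. The sketch below uses the usual textbook formulas for CFI, TLI, and RMSEA, with the numbers copied from the summary above. One assumption to flag: RMSEA is computed with N in the denominator, while some programs use N - 1 (both round to the same value here). It also squares the standardized path coefficient to get the share of variance in the Depression factor explained by Rumination.

```r
# Values copied from the summary() output above
chisq_model <- 1689.041; df_model <- 664   # user model
chisq_base  <- 5182.076; df_base  <- 703   # baseline (independence) model
n_obs       <- 257

# Excess of each chi-square over its degrees of freedom, floored at zero
excess_model <- max(chisq_model - df_model, 0)
excess_base  <- max(chisq_base - df_base, 0)

# CFI: proportional reduction in misfit relative to the baseline model
cfi <- 1 - excess_model / max(excess_base, excess_model)

# TLI: same idea, based on chi-square / df ratios
ratio_model <- chisq_model / df_model
ratio_base  <- chisq_base / df_base
tli <- (ratio_base - ratio_model) / (ratio_base - 1)

# RMSEA: misfit per degree of freedom, per observation (assumes N, not N - 1)
rmsea <- sqrt(excess_model / (df_model * n_obs))

# Variance in Depression explained by Rumination (standardized path squared)
std_path  <- 0.586
r_squared <- std_path^2

print(round(c(CFI = cfi, TLI = tli, RMSEA = rmsea, R2 = r_squared), 3))
```

These should match the 0.771, 0.758, and 0.078 reported by lavaan. One minus the R-squared value is about 0.657, which is the standardized residual variance listed for .Depression at the bottom of the output.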

How could we potentially improve this model? In the Beta post, we also conducted a separate regression where we broke Rumination down into its 3 subscales: Depression-Related Rumination, Brooding, and Reflecting. We could try conducting another LVPA where we use those 3 subscales instead of a single Rumination factor.

So let’s create another model including 4 factors: Depression (using the CESD items), Depression-Related Rumination, Brooding, and Reflecting. We’ll then add causal paths between these 3 Rumination constructs and Depression.

Rum3_Dep<-'
Depression =~ Dep1 + Dep2 + Dep3 + Dep4 + Dep5 + Dep6 + Dep7 + Dep8 +
Dep9 + Dep10 + Dep11 + Dep12 + Dep13 + Dep14 + Dep15 + Dep16
DRR =~ Rum1 + Rum2 + Rum3 + Rum4 + Rum6 + Rum8 + Rum9 + Rum14 + Rum17 + Rum18 +
Rum19 + Rum22
Reflecting =~ Rum7 + Rum11 + Rum12 + Rum20 + Rum21
Brooding =~ Rum5 + Rum10 + Rum13 + Rum15 + Rum16
Depression ~ DRR + Reflecting + Brooding
'

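# Fit the four-factor model with three rumination subscales predicting depression
RD3 <- sem(Rum3_Dep, data = Facebook)
# Print estimates with standardized solutions and fit indices
summary(RD3, standardized = TRUE, fit.measures = TRUE)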
## lavaan (0.5-23.1097) converged normally after  61 iterations
##
## Number of observations 257
##
## Estimator ML
## Minimum Function Test Statistic 1566.635
## Degrees of freedom 659
## P-value (Chi-square) 0.000
##
## Model test baseline model:
##
## Minimum Function Test Statistic 5182.076
## Degrees of freedom 703
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.797
## Tucker-Lewis Index (TLI) 0.784
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -11705.685
## Loglikelihood unrestricted model (H1) -10922.367
##
## Number of free parameters 82
## Akaike (AIC) 23575.369
## Bayesian (BIC) 23866.394
## Sample-size adjusted Bayesian (BIC) 23606.429
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.073
## 90 Percent Confidence Interval 0.069 0.078
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.073
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## Depression =~
## Dep1 1.000 0.451 0.583
## Dep2 0.807 0.123 6.569 0.000 0.364 0.469
## Dep3 1.396 0.156 8.932 0.000 0.629 0.706
## Dep4 0.783 0.150 5.234 0.000 0.353 0.360
## Dep5 0.809 0.143 5.670 0.000 0.365 0.395
## Dep6 1.616 0.160 10.075 0.000 0.729 0.856
## Dep7 0.906 0.146 6.223 0.000 0.408 0.440
## Dep8 1.206 0.141 8.584 0.000 0.544 0.666
## Dep9 0.735 0.126 5.815 0.000 0.331 0.406
## Dep10 1.405 0.158 8.903 0.000 0.633 0.702
## Dep11 0.846 0.116 7.263 0.000 0.381 0.532
## Dep12 1.108 0.142 7.814 0.000 0.500 0.585
## Dep13 0.825 0.113 7.307 0.000 0.372 0.536
## Dep14 1.421 0.143 9.964 0.000 0.641 0.840
## Dep15 1.159 0.141 8.248 0.000 0.523 0.630
## Dep16 0.986 0.137 7.221 0.000 0.445 0.528
## DRR =~
## Rum1 1.000 0.617 0.603
## Rum2 0.833 0.118 7.036 0.000 0.514 0.498
## Rum3 0.812 0.118 6.898 0.000 0.501 0.486
## Rum4 0.928 0.119 7.779 0.000 0.573 0.562
## Rum6 1.117 0.132 8.486 0.000 0.689 0.628
## Rum8 1.135 0.134 8.461 0.000 0.700 0.626
## Rum9 1.052 0.128 8.201 0.000 0.649 0.601
## Rum14 1.132 0.126 8.971 0.000 0.699 0.676
## Rum17 1.238 0.133 9.319 0.000 0.764 0.713
## Rum18 1.236 0.134 9.199 0.000 0.763 0.700
## Rum19 1.174 0.127 9.234 0.000 0.724 0.704
## Rum22 1.238 0.134 9.236 0.000 0.764 0.704
## Reflecting =~
## Rum7 1.000 0.841 0.731
## Rum11 0.907 0.089 10.149 0.000 0.763 0.681
## Rum12 0.550 0.083 6.609 0.000 0.462 0.443
## Rum20 1.073 0.090 11.856 0.000 0.903 0.806
## Rum21 0.872 0.088 9.937 0.000 0.734 0.667
## Brooding =~
## Rum5 1.000 0.676 0.570
## Rum10 1.086 0.132 8.229 0.000 0.734 0.669
## Rum13 0.705 0.103 6.835 0.000 0.477 0.516
## Rum15 1.229 0.142 8.659 0.000 0.831 0.725
## Rum16 1.332 0.144 9.243 0.000 0.901 0.812
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## Depression ~
## DRR 0.748 0.245 3.050 0.002 1.024 1.024
## Reflecting -0.067 0.068 -0.975 0.329 -0.124 -0.124
## Brooding -0.223 0.188 -1.186 0.235 -0.335 -0.335
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## DRR ~~
## Reflecting 0.413 0.062 6.691 0.000 0.796 0.796
## Brooding 0.386 0.061 6.288 0.000 0.924 0.924
## Reflecting ~~
## Brooding 0.420 0.068 6.214 0.000 0.738 0.738
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Dep1 0.396 0.036 10.862 0.000 0.396 0.660
## .Dep2 0.469 0.042 11.076 0.000 0.469 0.780
## .Dep3 0.399 0.038 10.419 0.000 0.399 0.502
## .Dep4 0.836 0.075 11.199 0.000 0.836 0.870
## .Dep5 0.721 0.065 11.166 0.000 0.721 0.844
## .Dep6 0.193 0.022 8.779 0.000 0.193 0.267
## .Dep7 0.695 0.063 11.115 0.000 0.695 0.807
## .Dep8 0.371 0.035 10.600 0.000 0.371 0.556
## .Dep9 0.556 0.050 11.154 0.000 0.556 0.835
## .Dep10 0.412 0.039 10.436 0.000 0.412 0.507
## .Dep11 0.369 0.034 10.973 0.000 0.369 0.717
## .Dep12 0.480 0.044 10.857 0.000 0.480 0.658
## .Dep13 0.344 0.031 10.965 0.000 0.344 0.713
## .Dep14 0.171 0.019 9.108 0.000 0.171 0.295
## .Dep15 0.416 0.039 10.730 0.000 0.416 0.604
## .Dep16 0.512 0.047 10.980 0.000 0.512 0.721
## .Rum1 0.666 0.062 10.824 0.000 0.666 0.636
## .Rum2 0.802 0.073 11.042 0.000 0.802 0.752
## .Rum3 0.812 0.073 11.060 0.000 0.812 0.764
## .Rum4 0.709 0.065 10.922 0.000 0.709 0.684
## .Rum6 0.728 0.068 10.751 0.000 0.728 0.605
## .Rum8 0.762 0.071 10.758 0.000 0.762 0.608
## .Rum9 0.745 0.069 10.829 0.000 0.745 0.639
## .Rum14 0.578 0.055 10.577 0.000 0.578 0.542
## .Rum17 0.565 0.054 10.404 0.000 0.565 0.492
## .Rum18 0.605 0.058 10.470 0.000 0.605 0.510
## .Rum19 0.534 0.051 10.451 0.000 0.534 0.505
## .Rum22 0.594 0.057 10.451 0.000 0.594 0.504
## .Rum7 0.616 0.067 9.208 0.000 0.616 0.465
## .Rum11 0.673 0.069 9.743 0.000 0.673 0.536
## .Rum12 0.876 0.080 10.894 0.000 0.876 0.804
## .Rum20 0.439 0.056 7.885 0.000 0.439 0.350
## .Rum21 0.672 0.068 9.866 0.000 0.672 0.555
## .Rum5 0.952 0.089 10.652 0.000 0.952 0.675
## .Rum10 0.665 0.065 10.167 0.000 0.665 0.552
## .Rum13 0.627 0.058 10.822 0.000 0.627 0.734
## .Rum15 0.624 0.064 9.720 0.000 0.624 0.475
## .Rum16 0.420 0.050 8.399 0.000 0.420 0.341
## .Depression 0.122 0.026 4.626 0.000 0.599 0.599
## DRR 0.380 0.074 5.176 0.000 1.000 1.000
## Reflecting 0.708 0.111 6.406 0.000 1.000 1.000
## Brooding 0.457 0.097 4.734 0.000 1.000 1.000

Fit measures are only slightly better than before (CFI of 0.797 versus 0.771, RMSEA of 0.073 versus 0.078). Once again, this could be because we’re measuring clinical constructs in a non-clinical sample, but let’s skip those and look at our path coefficients. As we found in the regression, Depression-Related Rumination has a significant relationship with Depression; Reflecting and Brooding do not. So we could simplify our model by dropping those two paths, if we wanted. Personally, I would leave them in as further evidence that the kind of rumination most strongly related to depression is rumination that fixates on one’s negative traits and feelings. Reflecting on feelings, or being morose in general, doesn’t seem to contribute, at least not above and beyond the first kind of rumination.
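The significance calls for the three paths come from Wald z-tests: each estimate divided by its standard error, compared against a standard normal distribution. Here is a minimal sketch using the rounded values printed in the Regressions block above, so the results may differ from lavaan's in the second or third decimal place. The data frame and its column names are my own.

```r
# Estimates and standard errors copied from the Regressions block above
paths <- data.frame(
  predictor = c("DRR", "Reflecting", "Brooding"),
  estimate  = c(0.748, -0.067, -0.223),
  std_err   = c(0.245, 0.068, 0.188)
)

# Wald z statistic: estimate relative to its standard error
paths$z <- paths$estimate / paths$std_err

# Two-sided p-value from the standard normal distribution
paths$p_value <- 2 * pnorm(-abs(paths$z))

# Flag paths that clear the conventional 0.05 threshold
paths$significant <- paths$p_value < 0.05

print(paths, digits = 3)
```

Only the DRR path clears the 0.05 threshold, in line with the p-values of 0.002, 0.329, and 0.235 in the lavaan output.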

On Sunday, we’ll dig into fit measures – how they’re calculated and what they mean – so check back then! And tomorrow in A to Z, R Markdown Files, which I’ve been using this month to create most of my posts.
