This post discusses ways of adjusting correlations for reliability.
Classical Test Theory states that an observed score is the sum of a true score and error. The true score is latent. In psychology, theoretical interest typically lies with the latent variable rather than the observed one. How can you estimate the correlation between two latent variables?
The correction for attenuation formula:
- rxy / sqrt(rxx * ryy)
- Or in words: The disattenuated correlation is the raw correlation between x and y (rxy) divided by the square root of the product of the reliability of x (rxx) and the reliability of y (ryy).
- See Page 130 of Murphy, K. R. & Davidshofer, C. O. (1988). Psychological Testing: Principles and Applications.
- The formula is also described on Wikipedia.
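The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `disattenuate` and the input values are made up for the example and do not come from any package.

```python
import math

def disattenuate(rxy, rxx, ryy):
    """Correct an observed correlation for unreliability in both measures.

    rxy: raw correlation between x and y
    rxx: reliability of x
    ryy: reliability of y
    """
    if not (0 < rxx <= 1 and 0 < ryy <= 1):
        raise ValueError("reliabilities must be in (0, 1]")
    return rxy / math.sqrt(rxx * ryy)

# Hypothetical values chosen purely for illustration.
corrected = disattenuate(rxy=0.30, rxx=0.70, ryy=0.80)
print(round(corrected, 3))  # 0.401
```

Note that with low reliability estimates the corrected value can exceed 1, which is a sign that the reliability estimates are too low or the raw correlation is inflated by sampling error.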
The psych package in R has the correct.cor function, which returns a matrix of corrected correlations. For details, see the function's help page, which describes the output as follows:
“Raw correlations below the diagonal, reliabilities on the diagonal, disattenuated above the diagonal.”
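To make that layout concrete, here is a rough Python sketch that builds a matrix in the same arrangement. It is not the psych implementation; the function name `corrected_matrix` and the input values are hypothetical.

```python
import math

def corrected_matrix(raw, reliabilities):
    """Build a matrix with raw correlations below the diagonal,
    reliabilities on the diagonal, and disattenuated correlations above."""
    n = len(reliabilities)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                out[i][j] = reliabilities[i]
            elif i > j:
                # Lower triangle: keep the raw correlation.
                out[i][j] = raw[i][j]
            else:
                # Upper triangle: apply the correction for attenuation.
                out[i][j] = raw[i][j] / math.sqrt(
                    reliabilities[i] * reliabilities[j]
                )
    return out

# Hypothetical raw correlations and reliabilities for three measures.
raw = [
    [1.00, 0.30, 0.20],
    [0.30, 1.00, 0.40],
    [0.20, 0.40, 1.00],
]
rel = [0.70, 0.80, 0.90]

for row in corrected_matrix(raw, rel):
    print([round(v, 2) for v in row])
```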
Structural Equation Modelling:
A major motivation for doing Structural Equation Modelling is to estimate parameters (e.g., correlations and regression coefficients) after adjusting for reliability of measurement. You can either specify the reliability of measurement explicitly or you can estimate the reliability based on the indicators used.
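For the first option, one common approach with a single indicator per construct is to fix the indicator's error variance at (1 - reliability) times its observed variance. The Python sketch below shows only that arithmetic, not an actual SEM fit; the function names and numbers are hypothetical, and it assumes the errors are uncorrelated with each other and with the true scores.

```python
import math

def fixed_error_variance(observed_var, reliability):
    """Error variance implied by a given reliability under Classical Test Theory."""
    return (1.0 - reliability) * observed_var

def implied_latent_correlation(cov_xy, var_x, var_y, rxx, ryy):
    """Correlation between the latent variables when error variances are fixed.

    With uncorrelated errors, the latent covariance equals the observed
    covariance, and each latent variance is observed variance minus error variance.
    """
    true_var_x = var_x - fixed_error_variance(var_x, rxx)
    true_var_y = var_y - fixed_error_variance(var_y, ryy)
    return cov_xy / math.sqrt(true_var_x * true_var_y)

# Hypothetical observed moments: r = 1.8 / (2 * 3) = 0.30
var_x, var_y, cov_xy = 4.0, 9.0, 1.8
print(round(implied_latent_correlation(cov_xy, var_x, var_y, 0.70, 0.80), 3))
```

Under these assumptions the result is the same as applying rxy / sqrt(rxx * ryy) to the observed correlation.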
It can be useful to present a correlation matrix with reliability-adjusted correlations above the diagonal and unadjusted correlations below it. The correct.cor function in the psych package provides this output.
Comments on Assessing Variable Importance in Multiple Regression:
If you are trying to assess the relative importance of a set of predictors in a multiple regression, differences in reliability among the predictors are a problem. More reliable predictors will tend to look more important than less reliable ones, partly because of the difference in reliability rather than any difference in the underlying constructs.
In this situation, it is desirable to design a study in which all measures are reliable, and equally so. SEM is a good option if the data have already been collected and the measures differ in reliability.
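The reliability confound described above can be illustrated with a small worked example. The Python sketch below uses hypothetical numbers: two predictors with identical true-score relationships to the outcome but different reliabilities, and an outcome assumed to be perfectly reliable for simplicity. It uses the standard two-predictor formula for standardized regression weights.

```python
import math

def standardized_betas(r_y1, r_y2, r_12):
    """Standardized regression weights for a model with two predictors."""
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2

# Hypothetical true-score correlations: both predictors relate equally to y.
true_r_y1 = true_r_y2 = 0.50
true_r_12 = 0.30

# Hypothetical reliabilities: predictor 1 is measured more reliably.
rel_1, rel_2 = 0.90, 0.60

# Attenuate the true-score correlations to get the expected observed ones.
obs_r_y1 = true_r_y1 * math.sqrt(rel_1)
obs_r_y2 = true_r_y2 * math.sqrt(rel_2)
obs_r_12 = true_r_12 * math.sqrt(rel_1 * rel_2)

true_betas = standardized_betas(true_r_y1, true_r_y2, true_r_12)
obs_betas = standardized_betas(obs_r_y1, obs_r_y2, obs_r_12)

print("true-score betas:", [round(b, 3) for b in true_betas])
print("observed betas:  ", [round(b, 3) for b in obs_betas])
```

At the true-score level the two weights are equal, while at the observed level the more reliable predictor receives the larger weight.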