Making Friends with Multicollinearity
Over a hundred years ago, Charles Spearman noted that performance scores from different cognitive tasks were highly correlated. Wikipedia provides a comprehensive review and a number of good examples of such correlation matrices. When looking at the recurring pattern of positive correlations among almost all cognitive tasks, Spearman saw the presence of a single latent ability dimension, which he called “g.” Spearman was not interested in running regression analyses with cognitive tasks as separate predictors. He was not concerned with the individual contribution of each cognitive task controlling for all the other cognitive tasks. He did not see multicollinearity as a problem but as an indication that each predictor was a manifestation of the same underlying latent trait. Spearman was inventing factor analysis and cared more about the latent trait than the manifest variables. Multicollinearity was a friend because it allowed Spearman to “see” behind the observed variables.
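To make Spearman's reading of the correlation matrix concrete, here is a minimal simulation sketch, not anything taken from Spearman's data. It assumes four hypothetical tasks whose scores are each a weighted copy of one latent trait plus independent noise. The loadings (0.8, 0.7, 0.6, 0.5), the sample size, the seed, and all function names are my own illustrative choices. It is written in plain Python with only the standard library, and it uses the first principal component of the correlation matrix as a rough stand-in for a proper factor analysis.

```python
import random
import statistics


def simulate_scores(n_people=2000, loadings=(0.8, 0.7, 0.6, 0.5), seed=1):
    """Simulate standardized task scores driven by one latent trait g.

    The loadings, sample size, and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    columns = [[] for _ in loadings]
    for _ in range(n_people):
        g = rng.gauss(0.0, 1.0)  # latent ability shared by every task
        for column, loading in zip(columns, loadings):
            # Unique noise scaled so each score has unit variance in expectation
            unique_sd = (1.0 - loading ** 2) ** 0.5
            column.append(loading * g + unique_sd * rng.gauss(0.0, 1.0))
    return columns


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_matrix(columns):
    """Correlation matrix of the simulated tasks, as a list of rows."""
    k = len(columns)
    return [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]


def largest_eigenvalue(matrix, iterations=200):
    """Power iteration: approximate the largest eigenvalue of a symmetric matrix."""
    k = len(matrix)
    vector = [1.0] * k
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(k)) for i in range(k)]
        norm = sum(value ** 2 for value in product) ** 0.5
        vector = [value / norm for value in product]
    product = [sum(matrix[i][j] * vector[j] for j in range(k)) for i in range(k)]
    return sum(v * p for v, p in zip(vector, product))  # Rayleigh quotient


scores = simulate_scores()
corr = correlation_matrix(scores)
# Eigenvalues of a correlation matrix sum to the number of variables,
# so this ratio is the share of total variance on the first component.
share = largest_eigenvalue(corr) / len(corr)

for row in corr:
    print("  ".join(f"{value:5.2f}" for value in row))
print(f"share of variance on the first component: {share:.2f}")
```

Under these assumptions every off-diagonal correlation should come out positive, and a single component should absorb more than half of the total variance. A regression analyst would read the same matrix as troublesome multicollinearity; Spearman would read it as four noisy views of one trait.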
Item response theory follows Spearman’s lead. Test scores on cognitive tasks are replaced with individual items, but the focus remains on the latent trait responsible for the item score. In fact, items that do not measure the same latent trait in the same way across respondents are flagged for differential item functioning and removed. In an earlier post, I attempted an intuitive introduction to item response theory. I plan to return to this topic in future posts. The positive manifold, the pattern of positive correlations among nearly all measures that Spearman observed, is also a common structure underlying rating data (e.g., halo effects). My goal is to examine in some depth the cognitive and affective processes that are used when answering rating items and to show how the positive manifold results from such processes.