R-squared for multilevel models

[This article was first published on Statistical Modeling, Causal Inference, and Social Science » R, and kindly contributed to R-bloggers.]

Fred Schiff writes:

I’m writing to you to ask about the “R-squared” approximation procedure you suggest in your 2004 book with Dr. Hill. [See also this paper with Pardoe—ed.]

I’m a media sociologist at the University of Houston. I’ve been using HLM3 for about two years.

Briefly about my data. It’s a content analysis of news stories with a continuous scale dependent variable, story prominence. I have 6090 news stories, 114 newspapers, and 59 newspaper group owners. All the Level-1, Level-2 and dependent variables have been standardized. Since the means were zero anyway, we left the variables uncentered. All the Level-3 ownership groups and characteristics are dichotomous scales that were left uncentered.

PROBLEM: The single most important result I am looking for is to compare the strength of nine competing Level-1 variables in their ability to predict and explain the outcome variable, story prominence. We are trying to use the residuals to calculate an “R-squared” measure for each level as you and Hill proposed. We haven’t been able to generate OLS regression equations for each newspaper and ownership group in HLM because the manual suggests “optional settings” that are not available in our software (HLM 6.06).

QUESTION-1 – How could we generate the estimated Bayesian residuals for level-1?

QUESTION-2 – Is it legitimate to run a model where Level-1 and Level-2 variables are standardized and Level-3 variables are dichotomous dummy variables?

QUESTION-3 – Is it legitimate to run models to estimate parameters for each ownership group and at the same time include the corresponding dummy variables as part of the data structure?

QUESTION-4 – In equations that include Level-3 variables, is it valid to describe the results as applying selectively to the stories (L1) in newspapers (L2) owned by one ownership group (L3, coded 1) as opposed to stories in newspapers of other ownership groups (L3, coded 0)?

My reply:

1. I don’t know the HLM software so I don’t know how to use it to compute the Bayesian residuals. But you might be happy to hear that we are currently working on implementing these ideas using the lmer/glmer software in R. Once it’s been programmed in one package, it shouldn’t be hard for people to translate it into another.
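For readers who want the target quantity written out, here is a sketch of the level-specific R-squared as it is usually stated from the paper with Pardoe. The notation is an assumption of this sketch, not something taken from the HLM manual: at a given level the model is written as θ_k = μ_k + ε_k for k = 1, ..., K, V denotes the finite-sample variance across the K units, and E denotes a posterior mean. Check the paper for the exact definitions before relying on it.

```latex
% Level-specific R-squared (sketch; notation assumed as described above)
% Model at one level:  \theta_k = \mu_k + \epsilon_k,  k = 1, \dots, K
R^2 \;=\; 1 \;-\;
  \frac{\mathrm{E}\!\left( V_{k=1}^{K}\, \epsilon_k \right)}
       {\mathrm{E}\!\left( V_{k=1}^{K}\, \theta_k \right)},
\qquad
V_{k=1}^{K}\, z_k \;=\; \frac{1}{K-1} \sum_{k=1}^{K} \left( z_k - \bar{z} \right)^2 .
```

At the story level, θ_k is the observed outcome and ε_k is the level-1 residual; at the newspaper or ownership-group level, θ_k is the group-level coefficient and ε_k is its group-level error. This is why the estimated residuals at each level are the ingredients needed.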

2. Yes, this is fine. When in doubt, interpret coefficients by considering predictions with inputs set to various reasonable fixed values.

3. I don’t quite understand this question. If you have all the data loaded in, you should be able to use ownership group as a level and also include predictors at that level.

4. I think this is reasonable but I’m not following all the details. Again, when in doubt, it’s always a good idea to understand your model through comparisons of specific predictions. That’s one trick we use in our book on occasion.
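The "compare specific predictions" advice in points 2 and 4 can be sketched in a few lines of Python. Everything here is hypothetical: the coefficient values are made up, and the model form (one standardized Level-1 predictor, one Level-3 ownership dummy, and their interaction) is an assumption chosen only to illustrate the idea, not the model fitted to the newspaper data.

```python
def predict_prominence(x_std, group_dummy, coef):
    """Linear predictor for a hypothetical model with a cross-level interaction."""
    return (coef["intercept"]
            + coef["x"] * x_std
            + coef["group"] * group_dummy
            + coef["x_by_group"] * x_std * group_dummy)


def comparison_table(coef, x_values=(-1.0, 0.0, 1.0)):
    """Predictions for group coded 0 vs. 1 at a few fixed predictor values."""
    rows = []
    for x in x_values:
        p0 = predict_prominence(x, 0, coef)  # stories under other owners
        p1 = predict_prominence(x, 1, coef)  # stories under the focal owner
        rows.append((x, p0, p1, p1 - p0))
    return rows


# Made-up coefficients on the standardized scale (assumption, for illustration).
coef = {"intercept": 0.0, "x": 0.30, "group": 0.10, "x_by_group": 0.15}

if __name__ == "__main__":
    print("x_std  group=0  group=1  difference")
    for x, p0, p1, diff in comparison_table(coef):
        print(f"{x:5.1f}  {p0:7.2f}  {p1:7.2f}  {diff:10.2f}")
```

Because the predictor is standardized, the values -1, 0, and 1 correspond to one standard deviation below the mean, the mean, and one standard deviation above it, which makes the group differences easy to read off.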
