R-squared for multilevel models

January 15, 2012

(This article was first published on Statistical Modeling, Causal Inference, and Social Science » R, and kindly contributed to R-bloggers)

Fred Schiff writes:

I’m writing to you to ask about the “R-squared” approximation procedure you suggest in your 2004 book with Dr. Hill. [See also this paper with Pardoe---ed.]

I’m a media sociologist at the University of Houston. I’ve been using HLM3 for about two years.

Briefly about my data. It’s a content analysis of news stories with a continuous scale dependent variable, story prominence. I have 6090 news stories, 114 newspapers, and 59 newspaper group owners. All the Level-1, Level-2 and dependent variables have been standardized. Since the means were zero anyway, we left the variables uncentered. All the Level-3 ownership groups and characteristics are dichotomous scales that were left uncentered.

PROBLEM: The single most important result I am looking for is to compare the strength of nine competing Level-1 variables in their ability to predict and explain the outcome variable, story prominence. We are trying to use the residuals to calculate an “R-squared” measure for each level as you and Hill proposed. We haven’t been able to generate OLS regression equations for each newspaper and ownership group in HLM because the manual suggests “optional settings” that are not available in our software (HLM 6.06).

QUESTION-1 – How could we generate the estimated Bayesian residuals for level-1?

QUESTION-2 – Is it legitimate to run a model where Level-1 and Level-2 variables are standardized and Level-3 variables are dichotomous dummy variables?

QUESTION-3 – Is it legitimate to run models to estimate parameters for each ownership group and at the same time include the corresponding dummy variables as part of the data structure?

QUESTION-4 – In equations that include Level-3 variables, is it valid to describe the results as applying selectively to the stories (L1) in newspapers (L2) owned by one ownership group (L3, coded 1) as opposed to stories in newspapers of other ownership groups (L3, coded 0)?

My reply:

1. I don’t know the HLM software so I don’t know how to use it to compute the Bayesian residuals. But you might be happy to hear that we are currently working on implementing these ideas using the lmer/glmer software in R. Once it’s been programmed in one package, it shouldn’t be hard for people to translate it into another.

2. Yes, this is fine. When in doubt, interpret coefficients by considering predictions with inputs set to various reasonable fixed values.

3. I don’t quite understand this question. If you have all the data loaded in, you should be able to use ownership group as a level and also include predictors at that level.

4. I think this is reasonable but I’m not following all the details. Again, when in doubt, it’s always a good idea to understand your model through comparisons of specific predictions. That’s one trick we use in our book on occasion.
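
As a rough illustration of the per-level “R-squared” idea raised in point 1, here is a minimal sketch in plain Python. It assumes you already have simulation draws of the errors at each level from whatever software fit the model. It is not the HLM or lmer implementation; the function name `level_r2` and all the numbers are hypothetical. At each level the measure is one minus the expected variance of the errors divided by the expected variance of the quantities being modeled, with expectations taken as averages over draws.

```python
from statistics import mean, variance


def level_r2(theta_draws, eps_draws):
    """Explained variance at one level: 1 - E[V(eps)] / E[V(theta)].

    theta_draws: simulation draws of the quantities being modeled at this level
                 (the outcomes at level 1, the group effects at higher levels).
    eps_draws:   matching draws of the errors at this level.
    Each draw is a list with one value per unit; V is the sample variance
    across units and E is the average across draws.
    """
    expected_error_var = mean(variance(eps) for eps in eps_draws)
    expected_total_var = mean(variance(theta) for theta in theta_draws)
    return 1.0 - expected_error_var / expected_total_var


# Hypothetical toy numbers: 6 stories, 3 newspapers, 2 simulation draws.
y = [1.0, 3.0, 2.0, 6.0, 5.0, 7.0]  # observed outcome, identical in every draw
story_errors = [
    [-0.5, 0.5, -1.0, 1.0, -0.5, 0.5],  # y minus its linear predictor, draw 1
    [-1.0, 1.0, -0.5, 0.5, -1.0, 1.0],  # draw 2
]
paper_effects = [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0]]  # newspaper intercepts
paper_errors = [[-0.5, 0.0, 0.5], [-0.5, 1.0, -0.5]]  # intercept minus group-level predictor

r2_story = level_r2([y, y], story_errors)
r2_paper = level_r2(paper_effects, paper_errors)
print(f"story level: {r2_story:.3f}, newspaper level: {r2_paper:.3f}")
```

With a single draw per argument the function reduces to a plug-in estimate based on point estimates, which may be the only option when the software does not provide simulations.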
