# GEE QIC update

November 15, 2012
By

(This article was first published on Daniel J. Hocking » Blog, and kindly contributed to R-bloggers)

Here is improved code for calculating QIC from geeglm in geepack in R (original post). Let me know how it works. I haven’t tested it much, but is seems that QIC may select overparameterized models. In the code below, I had to replace <- with = because wordpress didn’t except <- within code or pre tags. It should still work just fine.

Here is a quick example of how to run this function. First, highlight and run the code below in R. That will save the function in your workspace. Then run your gee model using geeglm in geepack (package available from CRAN). Next, run QIC(your_gee_model) and you get the QIC. You can then repeat this with alternative a priori models. Below the function is an example using data available as part of geepack.

```
############################################################
# QIC for GEE models
# Daniel J. Hocking
############################################################
QIC = function(model.R) {
library(MASS)
model.indep = update(model.R, corstr = "independence")
# Quasilikelihood
mu.R = model.R\$fitted.values
y = model.R\$y
type = family(model.R)\$family
quasi.R = switch(type,
poisson = sum((y*log(mu.R)) - mu.R),
gaussian = sum(((y - mu.R)^2)/-2),
binomial = sum(y*log(mu.R/(1 - mu.R)) + log(1 - mu.R)),
Gamma = sum(-y/(mu.R - log(mu.R))),
stop("Error: distribution not recognized"))
# Trace Term (penalty for model complexity)
omegaI = ginv(model.indep\$geese\$vbeta.naiv) # Omega-hat(I) via Moore-Penrose generalized inverse of a matrix in MASS package
#AIinverse = solve(model.indep\$geese\$vbeta.naiv) # solve via indenity
Vr = model.R\$geese\$vbeta
trace.R = sum(diag(omegaI %*% Vr))
px = length(mu.R) # number non-redunant columns in design matrix
# QIC
QIC = 2*(trace.R - quasi.R) [EDIT: original post was missing '*']
#QICu = (-2)*quasi.R + 2*px    # Approximation assuming model structured correctly
output = c(QIC, quasi.R, trace.R, px)
names(output) = c('QIC', 'Quasi Lik', 'Trace', 'px')
return(output)
}```

Here’s an example you can run in R.

```library(geepack)

data(dietox)
dietox\$Cu = as.factor(dietox\$Cu)
mf = formula(Weight ~ Cu * (Time + I(Time^2) + I(Time^3)))
gee1 = geeglm(mf, data = dietox, id = Pig, family = poisson, corstr = "ar1")
gee1
summary(gee1)

mf2 = formula(Weight ~ Cu * Time + I(Time^2) + I(Time^3))
gee2 = geeglm(mf2, data = dietox, id = Pig, family = poisson, corstr = "ar1")
summary(gee2)
anova(gee2)
anova(gee1, gee2)

mf3 = formula(Weight ~ Cu + Time + I(Time^2))
gee3 = geeglm(mf3, data = dietox, id = Pig, family = poisson, corstr = "ar1")
gee3.I = update(gee3, corstr = "independence")
gee3.Ex = update(gee3, corstr = "exchangeable")

sapply(list(gee1, gee2, gee3, gee3.I, gee3.Ex), QIC)```

In the output of this model it suggests that model gee1 is the best model. I have some concerns that QIC will almost inevitably choose the most complex model. More testing with simulated data will be necessary.

```               [,1]      [,2]      [,3]      [,4]      [,5]
QIC       -333199.7 -333188.0 -333187.5 -333181.8 -333153.6
Quasi Lik  166623.0  166622.7  166620.4  166622.2  166615.4
Trace          23.2      28.7      26.6      31.3      38.6
px            861.0     861.0     861.0     861.0     861.0```

You will get warnings when running this model because it uses a Poisson distribution for continuous data. I will work on finding a better example in the future before I make this available as an R package.

Filed under: GEE, Modeling, R Tagged: GEE, generalized estimating equations, QIC, R, regression

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...