Meta-analysis methods when studies are not normally distributed

June 10, 2014

(This article was first published on Robert Grant's stats blog » R, and kindly contributed to R-bloggers)

Yesterday I was reading Kontopantelis & Reeves’s 2010 paper “Performance of statistical methods for meta-analysis when true study effects are non-normally distributed: A simulation study”, which compares fixed-effects and a variety of random-effects models under the (entirely realistic) situation where the true study effects do not happen to be drawn from a normal distribution. In theory, the observed effects would be normal if they were just perturbed from a global mean by sampling error, and that leads us to the fixed-effects model. The random-effects model says that there’s other stuff making the inter-study variation even bigger, and the trouble is that by definition you don’t know what that ‘stuff’ is (or you would have modelled it, wouldn’t you???).
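To make the fixed-versus-random distinction concrete, here is a minimal simulation sketch in plain Python. Everything in it is my own illustration rather than anything from the paper: the function name `simulate_effects`, the parameter values, and the choice of a shifted exponential as the skewed distribution are all assumptions.

```python
import random
import statistics

def simulate_effects(n_studies, mu=0.3, se=0.2, tau=0.0, skewed=False, seed=1):
    """Simulate observed study effects y_i = mu + u_i + e_i.

    e_i is normal sampling error with SD `se`; u_i is the study-level
    deviation with SD `tau` (tau = 0 gives the fixed-effects model).
    With skewed=True, u_i is a shifted exponential: mean 0, SD tau,
    but with a long right tail. All values here are illustrative.
    """
    rng = random.Random(seed)
    effects = []
    for _ in range(n_studies):
        if tau == 0:
            u = 0.0
        elif skewed:
            # expovariate(1 / tau) has mean tau, so subtracting tau centres it
            u = rng.expovariate(1 / tau) - tau
        else:
            u = rng.gauss(0.0, tau)
        effects.append(mu + u + rng.gauss(0.0, se))
    return effects

fixed = simulate_effects(5000, tau=0.0)
random_fx = simulate_effects(5000, tau=0.3, skewed=True)
print(statistics.variance(fixed), statistics.variance(random_fx))
```

The first variance should sit near se² = 0.04 and the second near se² + tau² = 0.13. That gap is the extra ‘stuff’, and in the second case it is not normal.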

The random-effects options they consider (with my irreverent nutshell descriptions in brackets; don’t write and tell me they are over-simplifications, that’s the point) are:

DerSimonian-Laird (pretend we know the intra-study SD)
Biggerstaff-Tweedie (intra-study SDs are themselves drawn from an inverse-chi-squared distribution)
Sidik-Jonkman (account for unknown global SD by drawing the means from a t-distribution)
“Q-based” (test Cochran’s Q for heterogeneity; if significant, use D-L, if not, use FE)
maximum likelihood for both mean and SD
profile likelihood for mean under unknown SD
a permutation version of D-L, which was proposed by Follmann & Proschan (seeing as everyone else has been immortalized in the meta-analysis Hall of Infamy, I’m going to do it to them too and call it Follmann-Proschan)

All in all a pretty exhaustive list. They tested them out in 10,000 simulations with data from a variety of skewed and leptokurtic population distributions, and different numbers of studies.
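For anyone who hasn’t seen D-L written down, this is roughly what the standard moment estimator looks like as a plain-Python sketch. It is the textbook formula as I understand it, not the authors’ code; the function name `dersimonian_laird` and the three-study example data are made up for illustration.

```python
def dersimonian_laird(effects, variances):
    """Pool study effects with the DerSimonian-Laird moment estimator.

    effects   -- observed effect from each study
    variances -- within-study sampling variance of each effect,
                 treated as if known exactly
    Returns (pooled effect, its standard error, tau^2, Cochran's Q).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]               # fixed-effects weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effects mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2, q

# Made-up example: three studies with equal precision
pooled, se, tau2, q = dersimonian_laird([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(pooled, se, tau2, q)
```

With these invented numbers Q is 8 on 2 degrees of freedom, tau² works out at 0.03, and the pooled effect stays at 0.3 because the studies are equally weighted. The “Q-based” method would compare that Q against a chi-squared distribution before deciding whether to bother with tau² at all.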

Conclusion one is that they are all pretty robust to all but the most bizarre deviations from normality. Having said that, the authors offer a graph dividing the world into D-L optimal and profile likelihood optimal, which I found a bit odd because, firstly, DerSimonian-Laird is never true, it’s just an approximation; secondly, profile likelihood requires bespoke, painful programming each time; and thirdly, they just said it doesn’t matter. I rather like the look of Sidik-Jonkman in the tables of results, but that may be a cognitive bias in me that prefers old-skool solutions along the lines of “just log everything and do a t-test, it’ll be about right and then you can go home early” (a strange attitude in one who spends a lot of time doing Bayesian structural equation models). I also like Follmann-Proschan for their auto-correcting permutations, but if a non-permuting method can give me a decent answer, why bother?

Interestingly, the authors have provided all these methods in an Excel plug-in (I can’t recommend that, but on your head be it) and the Stata package metaan, which I shall be looking into next time I have to smash studies together. In R, you can get profile likelihood (and, I think, Biggerstaff-Tweedie) from the metaLik package, and maximum likelihood, Sidik-Jonkman and a REML estimator too from metafor. Simpler options are in rmeta and some more esoteric ones in meta. However, it still seems to me that the most important thing to do is to look at the individual study effects and try to work out what shape they follow and what factors in the study design and execution could have put them there. This could provide the reader with much richer information than just one mega-result (oops sorry, inadvertently strayed a little too close to Eysenck there) that sweeps the cause of heterogeneity under the carpet.
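As a crude first step towards “working out what shape they follow”, you could compute moment-based skewness and excess kurtosis for the observed study effects. This is only a sketch under my own assumptions (the function name `shape_summary` and the example numbers are invented). Bear in mind that observed effects carry sampling error on top of the true effects, and that with a handful of studies these summaries are very noisy, so a plot is still the better tool.

```python
def shape_summary(effects):
    """Return (skewness, excess kurtosis) of a list of study effects.

    Uses simple population moments; needs at least two distinct values.
    Both summaries are about 0 for a normal distribution.
    """
    n = len(effects)
    mean = sum(effects) / n
    m2 = sum((y - mean) ** 2 for y in effects) / n  # variance
    m3 = sum((y - mean) ** 3 for y in effects) / n  # third central moment
    m4 = sum((y - mean) ** 4 for y in effects) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis

# Made-up effects: one study far out in the right tail
print(shape_summary([0.10, 0.12, 0.15, 0.18, 0.20, 0.95]))
```

A clearly positive skewness, as in this invented example, is a prompt to go and read the outlying study rather than to reach for a fancier estimator.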
