R and Meta-Analysis

July 17, 2014

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

by Joseph Rickert

Broadly speaking, a meta-analysis is any statistical analysis that attempts to combine the results of several individual studies. The term was apparently coined by statistician Gene V Glass in a 1976 speech he made to the American Educational Research Association. Since that time, not only has meta-analysis become a fundamental tool in medicine, but it is also becoming popular in economics, finance, the social sciences and engineering. Organizations responsible for setting standards for evidence-based medicine, such as the United Kingdom’s National Institute for Health and Care Excellence (NICE), make extensive use of meta-analysis.

The application of meta-analysis to medicine is intuitive and, on the surface, compelling. Clinical trials designed to test the efficacy of a new treatment for a disease against the standard treatment tend to be based on relatively small samples. (For example, the largest of four trials for Respiratory Tract Diseases currently listed on ClinicalTrials.gov has an estimated enrollment of 533 patients.) It would seem to be a “no brainer” to use “all of the information” to get more accurate results. However, as with so many things, the devil is in the details. The preliminary tasks of establishing a rigorous protocol to guide the meta-analysis and of conducting the systematic review to search for relevant studies are themselves far from trivial. One has to work hard to avoid “selection bias”, “publication bias” and other even more subtle difficulties.

In my limited experience with meta-analysis, I found it extraordinarily difficult to determine whether patient populations from different clinical trials were sufficiently homogeneous to be included in the same meta-analysis. Even when working with well-written papers, published in quality journals, a considerable amount of medical expertise was required to interpret the data. I came away with the strong impression that a good meta-analysis requires collaboration from a team of experts.

Historically, it has probably been the case that most meta-analyses were conducted either with general tools such as Excel or with specialized software like RevMan from the Cochrane Collaboration. However, R is the natural platform for meta-analysis, both because of the myriad possibilities for statistical analyses that are not generally available through the specialized software and because of the many packages devoted to various aspects of meta-analysis. The CRAN Meta Analysis Task View is exceptionally well organized, listing R packages according to the different stages of conducting a meta-analysis and also calling out some specialized techniques such as meta-regression and network meta-analysis.

In a future post, I hope to be able to explore some of these packages more closely. For now, let’s look at a very simple analysis based on Thomas Lumley’s rmeta package, which has been a part of R since 1999. The following simple meta-analysis is written up very nicely in the book by Chen and Peace titled Applied Meta-Analysis with R.

The cochrane data set in the rmeta package contains the results from seven randomized clinical trials designed to test the effectiveness of corticosteroid therapy in preventing neonatal deaths in premature labor. The columns of the data set are: the name of the trial center, the number of deaths in the treatment group, the total number of patients in the treatment group, the number of deaths in the control group and the total number of patients in the control group.

# Simple Meta-analysis
          name  ev.trt n.trt ev.ctrl n.ctrl
1     Auckland     36   532      60    538
2        Block      1    69       5     61
3        Doran      4    81      11     63
4        Gamsu     14   131      20    137
5     Morrison      3    67       7     59
6 Papageorgiou      1    71       7     75
7      Tauesch      8    56      10     71

The null hypothesis is that there is no difference between treatment and control. Following Chen and Peace, we fit both fixed effects and random effects models to look at the odds ratios.

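library(rmeta); data(cochrane)   # load the package and the data set
model.FE <- meta.MH(n.trt,n.ctrl,ev.trt,ev.ctrl, names=name,data=cochrane)    # fixed effects (Mantel-Haenszel)
model.RE <- meta.DSL(n.trt,n.ctrl,ev.trt,ev.ctrl, names=name,data=cochrane)   # random effects (DerSimonian-Laird)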

The summary for the fixed effects model shows that while only two studies, Auckland and Doran, individually show a significant effect, the overall confidence interval from the Mantel-Haenszel test does indicate a benefit from the treatment.

Fixed effects ( Mantel-Haenszel ) meta-analysis
Call: meta.MH(ntrt = n.trt, nctrl = n.ctrl, ptrt = ev.trt, pctrl = ev.ctrl, 
    names = name, data = cochrane)
               OR (lower  95% upper)
Auckland     0.58    0.38       0.89
Block        0.16    0.02       1.45
Doran        0.25    0.07       0.81
Gamsu        0.70    0.34       1.45
Morrison     0.35    0.09       1.41
Papageorgiou 0.14    0.02       1.16
Tauesch      1.02    0.37       2.77
Mantel-Haenszel OR =0.53 95% CI ( 0.39,0.73 )
Test for heterogeneity: X^2( 6 ) = 6.9 ( p-value 0.3303 )
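
The odds ratios in this summary can be checked by hand. The following is a minimal sketch in base R, with the seven trials typed in directly so that it runs without rmeta. The data frame name trials and the intermediate variable names are illustrative. It assumes the unadjusted odds ratio with a Woolf (log scale) interval for each study and the standard Mantel-Haenszel formula for the pooled odds ratio. Whether rmeta computes them internally in exactly this way is an assumption, although the results agree with the output above to two decimals.

```r
# Counts for the seven trials (values as in rmeta's cochrane data set)
trials <- data.frame(
  name    = c("Auckland", "Block", "Doran", "Gamsu",
              "Morrison", "Papageorgiou", "Tauesch"),
  ev.trt  = c(36, 1, 4, 14, 3, 1, 8),
  n.trt   = c(532, 69, 81, 131, 67, 71, 56),
  ev.ctrl = c(60, 5, 11, 20, 7, 7, 10),
  n.ctrl  = c(538, 61, 63, 137, 59, 75, 71)
)

# Cells of each 2x2 table: deaths and survivors in each arm
a  <- trials$ev.trt                    # deaths, treatment
b  <- trials$n.trt - trials$ev.trt     # survivors, treatment
cc <- trials$ev.ctrl                   # deaths, control
d  <- trials$n.ctrl - trials$ev.ctrl   # survivors, control
n  <- trials$n.trt + trials$n.ctrl     # total patients per trial

# Per-study odds ratio with a 95% interval computed on the log scale
log_or    <- log((a * d) / (b * cc))
se_log_or <- sqrt(1/a + 1/b + 1/cc + 1/d)
study <- data.frame(
  name  = trials$name,
  OR    = exp(log_or),
  lower = exp(log_or - 1.96 * se_log_or),
  upper = exp(log_or + 1.96 * se_log_or)
)

# Mantel-Haenszel pooled odds ratio
or_mh <- sum(a * d / n) / sum(b * cc / n)

print(study, digits = 2)
cat("Mantel-Haenszel OR =", round(or_mh, 2), "\n")
```

An interval that excludes 1 corresponds to a study that is individually significant at the 5% level, which is how the Auckland and Doran trials stand out.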

The summary for the random effects model for these data is identical except that, as one would expect, the overall confidence interval is somewhat wider: SummaryOR = 0.53, 95% CI (0.37, 0.78). A slight modification of the forest plot code provided by Chen and Peace (which works for both the fixed effects and random effects model objects) shows the typical way to present these results.

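CPplot <- function(model){
  # model[[1]], model[[2]]: study log odds ratios and their standard errors
  # model[[3]], model[[4]]: summary log odds ratio and its standard error
  c1 <- c("","Study",model$names,NA,"Summary")
  c2 <- c("Deaths","(Steroid)",cochrane$ev.trt,NA,NA)
  c3 <- c("Deaths","(Placebo)",cochrane$ev.ctrl,NA,NA)
  c4 <- c("","OR",format(exp(model[[1]]),digits=2),NA,format(exp(model[[3]]),digits=2))
  tableText <-cbind(c1,c2,c3,c4)
  mean   <- c(NA,NA,model[[1]],NA,model[[3]])
  stderr <- c(NA,NA,model[[2]],NA,model[[4]])
  low <- mean - 1.96*stderr
  up <- mean + 1.96*stderr
  # The listing breaks off here in this copy. The call below is a reconstruction
  # modeled on the example in the rmeta forestplot() help page, not the original code.
  forestplot(tableText,mean,low,up,zero=0,
             is.summary=c(TRUE,TRUE,rep(FALSE,8),TRUE),clip=c(log(0.1),log(2.5)),xlog=TRUE)
}
CPplot(model.FE)
CPplot(model.RE)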
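
The random effects summary quoted earlier (0.53, with 95% CI 0.37 to 0.78) can also be reproduced by hand. The sketch below is a base R version of the DerSimonian-Laird method: it pools the study log odds ratios with inverse-variance weights, estimates the between-study variance from Cochran's Q, and then re-weights. The variable names are illustrative, and it is an assumption that rmeta's meta.DSL follows exactly these steps. The Q computed here is centered on the inverse-variance estimate, so it need not match the heterogeneity statistic printed by meta.MH to the last digit.

```r
# Counts for the seven trials (values as in rmeta's cochrane data set)
ev.trt  <- c(36, 1, 4, 14, 3, 1, 8)
n.trt   <- c(532, 69, 81, 131, 67, 71, 56)
ev.ctrl <- c(60, 5, 11, 20, 7, 7, 10)
n.ctrl  <- c(538, 61, 63, 137, 59, 75, 71)

# Cells of each 2x2 table
a  <- ev.trt
b  <- n.trt - ev.trt
cc <- ev.ctrl
d  <- n.ctrl - ev.ctrl

# Study log odds ratios and their variances
y <- log((a * d) / (b * cc))
v <- 1/a + 1/b + 1/cc + 1/d
k <- length(y)

# Fixed effects (inverse-variance) estimate and Cochran's Q
w       <- 1 / v
y_fixed <- sum(w * y) / sum(w)
Q       <- sum(w * (y - y_fixed)^2)
p_het   <- pchisq(Q, df = k - 1, lower.tail = FALSE)

# DerSimonian-Laird estimate of the between-study variance, truncated at zero
tau2 <- max(0, (Q - (k - 1)) / (sum(w) - sum(w^2) / sum(w)))

# Random effects weights, summary estimate and 95% interval
w_re  <- 1 / (v + tau2)
y_re  <- sum(w_re * y) / sum(w_re)
se_re <- sqrt(1 / sum(w_re))
summary_or <- exp(y_re)
ci         <- exp(y_re + c(-1.96, 1.96) * se_re)

cat("Q =", round(Q, 2), " p =", round(p_het, 3), " tau^2 =", round(tau2, 3), "\n")
cat("Summary OR =", round(summary_or, 2),
    " 95% CI (", round(ci[1], 2), ",", round(ci[2], 2), ")\n")
```

Adding tau2 to every study's variance shrinks the large trials' share of the weight, which is why the random effects interval is wider than the fixed effects one.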



The whole idea of meta-analysis is intriguing. However, because of the challenges I mentioned above, I would be remiss not to point out that it elicits considerable criticism. The article Meta-analysis and its problems by H J Eysenck captures the issues and is well worth reading. Also, have a look at the review article by Walker, Hernandez and Kattan writing in the Cleveland Clinic Journal of Medicine.

