Global Temperature Proxy Reconstructions ~ Bayesian extrapolation of warming w/ rjags


Update: fixed projection.

There are a bunch of “hockey stick” reconstructions that estimate past global temperatures from proxies where instrumental data are absent.

There is a new one out there by McShane and Wyner (2010) that’s creating quite a stir in the blogosphere (here, here, here, here). The main takeaway being that the uncertainty is too great for the proxies to be of much use.

I thought it would be a fun exercise to recreate their analysis (code and data @ McShane’s website for now; @ Annals of Applied Statistics in the future). Broadly speaking, the paper has two parts: running PCA to come up with the predictors, and the Bayesian analysis to construct uncertainty bounds. I need to refresh my Bayesian instincts anyway. I might do the PCAs, but because I lack knowledge in this area, it’ll probably take a little more time.
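For a flavor of what that first part involves, the PC step is close to a one-liner in base R. A hypothetical sketch (proxy is a stand-in matrix of years by proxy series, not the actual M&W2010 data pipeline):

## hypothetical sketch of the PC step: 'proxy' stands in for a (years x proxies) matrix
proxy <- matrix(rnorm(149 * 95), nrow = 149)   # placeholder proxy data
pca <- prcomp(proxy, center = TRUE, scale. = TRUE)
PC <- pca$x[, 1:10]                            # first 10 PCs as regression covariates
summary(pca)$importance[2, 1:10]               # proportion of variance explained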

The general model is as follows (assuming all the PCAs are done):

\textnormal{temp}_t = \beta_0 + \beta_1\,\textnormal{temp}_{t-1} + \beta_2\,\textnormal{temp}_{t-2} + \sum_{i=1}^{10} \beta_{i+2}\,(\textnormal{PC})_{t,i} + \epsilon_t, \qquad \epsilon_t \sim N(0, \sigma^2)

The following are the uninformative priors:

\vec{\beta} \sim N(\vec{0},\ 1000\,I); \qquad \sigma \sim \textnormal{Unif}(0, 100)

This is basically an AR(2) model with principal components as covariates. Thanks to Bayes’ theorem, we can put uncertainty bounds around the parameters in addition to the uncertainty around the data (the residuals).
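To see what the model implies, here is a quick simulation from it with made-up parameter values (none of these numbers come from the paper):

## simulate one series from the AR(2)-plus-PCs model (made-up parameters)
set.seed(1)
n <- 149
beta <- c(0, 0.5, 0.3, rnorm(10, 0, 0.1))   # intercept, 2 AR terms, 10 PC loadings
PC <- matrix(rnorm(n * 10), n, 10)          # placeholder principal components
sigma <- 0.2
temp <- numeric(n)
temp[1:2] <- rnorm(2, 0, sigma)             # arbitrary start-up values
for (t in 3:n)
  temp[t] <- beta[1] + beta[2] * temp[t - 1] + beta[3] * temp[t - 2] +
    sum(beta[4:13] * PC[t, ]) + rnorm(1, 0, sigma)
plot(temp, type = "l")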

Now, to estimate this, I decided to use JAGS. This is basically the command-line version of WinBUGS, a program that uses Gibbs sampling to draw values from the posterior distribution. I was thinking of using PyMC (a Python package), but because I had so many problems with it, I resorted to rjags (WinBUGS not an option!).

For a casual intro, see Simon Jackson’s or John Myles White’s blog.

What’s nice about rjags (as opposed to using JAGS directly) is the R interface. The only thing you need to do outside of R is specify the model (and save it as a .bug file). The data, parameters, initial values, grabbing the results, inference: all of that can be done from the R interpreter.
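Just as a toy illustration of that round trip (a simple normal location model with made-up data, nothing to do with the reconstruction):

## toy example: write a model file, compile, burn in, sample, summarize
library(rjags)
writeLines("
model {
  for (i in 1:N) { y[i] ~ dnorm(mu, tau) }
  mu  ~ dnorm(0, 0.001)
  tau ~ dgamma(0.001, 0.001)
}", con = "toy.bug")
toy.jags <- jags.model("toy.bug", data = list(y = rnorm(50, mean = 2), N = 50),
                       n.chains = 2)
update(toy.jags, 500)                                   # burn in
toy.out <- coda.samples(toy.jags, c("mu", "tau"), n.iter = 1000)
summary(toy.out)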

Download the M&W2010 data and code, and you will find the .odc file (WinBUGS format). This file contains the model specification, so extract it and put it into a .bug file:

model
{
  ## likelihood: AR(2) in temperature plus 10 PC covariates
  for(i in 1:nrows){
    mu[i] <- beta[1] + beta[2]*Lagy[i,1] + beta[3]*Lagy[i,2] +
      inprod(beta[4:13], PC[i,1:10])
    y[i] ~ dnorm( mu[i], precsigma )
  }

  ## Priors:
  ## alternative: precsigma ~ dgamma(0.001, 0.001); sigma <- 1/sqrt(precsigma)
  sigma ~ dunif(0,100)
  precsigma <- 1/pow(sigma, 2)
  beta[1:13] ~ dmnorm( beta0[1:13], precbeta[1:13, 1:13] )
}
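Note that dnorm() in JAGS (as in WinBUGS) is parameterized by the mean and the precision (1/σ²) rather than the standard deviation, which is why the model carries the precsigma node; the commented-out lines show the common alternative of putting a Gamma prior directly on the precision.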

Everything else, we put in the R code (click below to expand).

## from McShane & Wyner (2010)

library(rjags)

# current directory
d <- getwd()

## read in data .. get rid of "END" and the zeros at the end of the file
mcshane.data <- read.table(paste(d, "/../bayesmodel/BUGS_data.txt", sep = ""), head = T)
names(mcshane.data) <- c("y[]", "Lagy[,1]", "Lagy[,2]",	"PC[,1]", "PC[,2]", "PC[,3]", "PC[,4]",	"PC[,5]", "PC[,6]", "PC[,7]", "PC[,8]", "PC[,9]", "PC[,10]")

# recreate the data
# don't really have to do this, but hey
write.table(mcshane.data, "mcshane.data", quote = F)

## fixed hyperparameters for the priors
nrows <- 149
beta0 <- rep(0, 13)           # prior mean of beta
precbeta <- diag(0.001, 13)   # diagonal prior precision, i.e. prior variance 1000

y <- mcshane.data[,grep("^y", names(mcshane.data))]
Lagy <- as.matrix(mcshane.data[,grep("Lagy", names(mcshane.data))])
PC <- as.matrix(mcshane.data[,grep("PC", names(mcshane.data))])

mcshane.all <- list(y = y, Lagy = Lagy, PC = PC,
                    nrows = nrows, beta0 = beta0, precbeta = precbeta)

mcshane.inits <- list(sigma = 1, beta = rep(0, 13))  # init the stochastic nodes

mcshane.jags <- jags.model(file = paste(d, "/mcshanejags.bug", sep = ""),
                           data = mcshane.all,
                           inits = mcshane.inits,
                           n.chains = 2)

# burn in
update(mcshane.jags, 1000)

# monitor
mcshane.out <- jags.samples(mcshane.jags, c("beta", "mu", "precsigma"),
                            n.iter = 1000, thin = 10)

names(mcshane.out)
##[1] "beta"      "mu"        "precsigma"

# make it readable to coda
mcshane.beta <- as.mcmc.list(mcshane.out$beta)
mcshane.mu <- as.mcmc.list(mcshane.out$mu)
mcshane.precsigma <- as.mcmc.list(mcshane.out$precsigma)

# basic coda plots!
par(mfrow = c(3,3), mar = c(5,4,4,2)/2)
densplot(mcshane.beta, ylim = c(0,20))

autocorr.plot(mcshane.beta, auto.layout = F)

cumuplot(mcshane.beta, auto.layout = F)

geweke.plot(mcshane.beta)

par(mfrow = c(1,1), mar = c(5,4,4,2)/2)
crosscorr.plot(mcshane.beta)

par(mfrow = c(3,3), mar = c(5,4,4,2)/2)
gelman.plot(mcshane.beta)
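
# a few numeric diagnostics to go with the plots (my addition, not in the
# M&W2010 code; coda is loaded as a dependency of rjags)
summary(mcshane.beta)           # posterior means, quantiles, MC errors
gelman.diag(mcshane.beta)       # potential scale reduction factors, want values near 1
effectiveSize(mcshane.beta)     # effective sample sizes after autocorrelation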

# now get results from the model for CRU temp
n.sims <- dim(mcshane.beta[[1]])[1]
n.years <- dim(mcshane.data)[1]
mcshane.predict <- matrix(0, nrow = n.sims, ncol = n.years)
for (i in 1:n.sims) {
  for (j in 1:n.years)
    mcshane.predict[i,j] <- rnorm(1, mcshane.mu[[1]][i,j], 1/sqrt(mcshane.precsigma[[1]][i]))
}

matplot(1850:1998+2, apply((mcshane.predict), 1, rev), type = "l", lty = 1, col = "lightgrey")
lines(1850:1998+2, rev(mcshane.data[-1,1]), col = "white", lwd = 2)
lines(supsmu(1850:1998+2, rev(apply(t(mcshane.predict), 1, mean))), col = "red", lwd = 2)

plot(supsmu(1850:1998, rev(apply(t(mcshane.predict), 1, function(x) quantile(x, .5)))), type = "l", ylim = c(-1, 1), lwd = 3, axes = F, xlab = "Year", ylab = "Temp Anomaly")
axis(1, at = seq(1850, 1998, by = 4), labels = seq(1850, 1998, by = 4))
axis(2, at = seq(-1,1,len=5), labels = seq(-1,1,len=5))
lines(supsmu(1850:1998, rev(apply(t(mcshane.predict), 1, function(x) quantile(x, .975)))), type = "l", lty = 2)
lines(supsmu(1850:1998, rev(apply(t(mcshane.predict), 1, function(x) quantile(x, .025)))), type = "l", lty = 2)
lines(1850:1998, rev(mcshane.data[,1]), col = "red", lwd = 2)
#legend("topleft", c("Smoothed Estimate", "Smoothed 95% Quantiles", "HADCRU NH Data"), col = c("black", "black", "red"), lty = c(1,2,1), lwd = c(3,1,2), box.lwd = 0)

## now do the prediction of the last thirty years
hold.range <- as.logical(c(rep(0, 30), rep(1, 119)))

hold.all <- list(y = y[hold.range], Lagy = Lagy[hold.range,],
                 PC = PC[hold.range,], nrows = sum(hold.range),
                 beta0 = beta0, precbeta = precbeta)

hold.jags <- jags.model(file = paste(d, "/mcshanejags.bug", sep = ""),
                        data = hold.all,
                        inits = mcshane.inits,
                        n.chains = 2)

# burn in
update(hold.jags, 1000)

# monitor
hold.out <- jags.samples(hold.jags, c("beta", "mu", "precsigma"),
                         n.iter = 1000, thin = 10)

hold.beta <- as.mcmc.list(hold.out$beta)
hold.mu <- as.mcmc.list(hold.out$mu)
hold.precsigma <- as.mcmc.list(hold.out$precsigma)

n.sims <- dim(hold.beta[[1]])[1]
n.years <- length(hold.all[[1]])
hold.predict <- matrix(0, nrow = n.sims, ncol = n.years)
for (i in 1:n.sims) {
  for (j in 1:n.years)
    hold.predict[i,j] <- rnorm(1, hold.mu[[1]][i,j], 1/sqrt(hold.precsigma[[1]][i]))
}

# non-parametric bootstrap
hold.sim <- 100
hold.forecast <- matrix(0, nrow = hold.sim, ncol = sum(!hold.range))
for (i in 1:hold.sim) {
  for (t in 120:149) {
    # process error
    mu <- sum(apply(hold.beta[[1]], 2, function(x) sample(x, 1)) *
              c(1, as.numeric(mcshane.data[149-t+1,-1])))
    # resid error
    hold.forecast[i,t-119] <- rnorm(1, mu,
                                    1/sqrt(sample(hold.precsigma[[1]], 1)))
  }
}

# mcshanefig18.png

points(supsmu(1850:1968, rev(apply(t(hold.predict), 1, function(x) quantile(x, .5)))), col = "orange", type = "l", lwd = 3)
lines(supsmu(1969:1998, (apply(t(hold.forecast), 1, function(x) quantile(x, .5)))), col = "orange", lwd = 3, type = "o")
lines(supsmu(1969:1998, (apply(t(hold.forecast), 1, function(x) quantile(x, .975)))), col = "orange", lwd = 1, lty = 2)
lines(supsmu(1969:1998, (apply(t(hold.forecast), 1, function(x) quantile(x, .025)))), col = "orange", lwd = 1, lty = 2)
legend("topleft", c("Smoothed Estimate", "Smoothed 95% Quantiles", "HADCRU NH Data", "Smoothed Estimate Interpolation w/ Holdout", "Smoothed Estimate Projection w/ Holdout", "Smoothed 95% Quantiles Projection w/ Holdout"), col = c("black", "black", "red", rep("orange",3)), lty = c(1,2,1,1,1,2), lwd = c(3,1,2,3,3,1), box.lwd = 0, pch = c(rep("", 4), "o", ""))

## update!!
# non-parametric bootstrap, part deux: 1969:1998
hold.sim <- 1000
hold2.forecast <- matrix(0, nrow = hold.sim, ncol = sum(!hold.range))
for (i in 1:hold.sim) {
  for (t in 120:149) {
    # new time series data
    if (t == 120) # grab lag 1 and lag2 actual data
      dat <- mcshane.data[(149-t+1)+1:2,1]
    if (t == 121) # grab projected data and lag 2 actual data
      dat <- c(hold2.forecast[i,(t-1)-119], mcshane.data[(149-t+1)+2,1])
    if (t >= 122) # grab projected data
      dat <- hold2.forecast[i,(t-1):(t-2)-119]
    #  PC
    pc <- as.numeric(mcshane.data[149-t+1,-(1:3)])
    # process error
    mu <- sum(apply(hold.beta[[1]], 2, function(x) sample(x, 1)) *
              c(1, as.numeric(dat), pc))
    # resid error
    hold2.forecast[i,t-119] <- rnorm(1, mu,
                                     1/sqrt(sample(hold.precsigma[[1]], 1)))
  }
}

# mcshanefig18b.png
points(supsmu(1850:1968, rev(apply(t(hold.predict), 1, function(x) quantile(x, .5)))), col = "green", type = "l", lwd = 3)
lines(supsmu(1969:1998, (apply(t(hold2.forecast), 1, function(x) quantile(x, .5)))), col = "green", lwd = 3, type = "o")
lines(supsmu(1969:1998, (apply(t(hold2.forecast), 1, function(x) quantile(x, .975)))), col = "green", lwd = 1, lty = 2)
lines(supsmu(1969:1998, (apply(t(hold2.forecast), 1, function(x) quantile(x, .025)))), col = "green", lwd = 1, lty = 2)
legend("topleft", c("Smoothed Estimate", "Smoothed 95% Quantiles", "HADCRU NH Data", "Smoothed Estimate Interpolation w/ Holdout", "Smoothed Estimate Projection w/ Holdout", "Smoothed 95% Quantiles Projection w/ Holdout"), col = c("black", "black", "red", rep("green",3)), lty = c(1,2,1,1,1,2), lwd = c(3,1,2,3,3,1), box.lwd = 0, pch = c(rep("", 4), "o", ""))

# non-parametric bootstrap, part three
# wait, this is backcast!
hold.sim <- 1000
inits <- as.numeric(mcshane.data[1,2:3]) # say known
hold3.forecast <- matrix(0, nrow = hold.sim, ncol = sum(!hold.range))
for (i in 1:hold.sim) {
  for (t in 1:30) {
    if (t == 1)
      dat <- inits
    if (t == 2)
      dat <- c(hold3.forecast[i,t-1], inits[2])
    if (t >= 3)
      dat <- hold3.forecast[i,t-(1:2)]
    #  PC
    pc <- as.numeric(mcshane.data[t,-(1:3)])
    # process error
    mu <- sum(apply(hold.beta[[1]], 2, function(x) sample(x, 1)) *
              c(1, as.numeric(dat), pc))
    # resid error
    hold3.forecast[i,t] <- rnorm(1, mu,
                                 1/sqrt(sample(hold.precsigma[[1]], 1)))
  }
}

# mcshanefig18c.png
points(supsmu(1850:1968, rev(apply(t(hold.predict), 1, function(x) quantile(x, .5)))), col = "green", type = "l", lwd = 3)
lines(supsmu(1969:1998, rev(apply(t(hold3.forecast), 1, function(x) quantile(x, .5)))), col = "green", lwd = 3, type = "o")
lines(supsmu(1969:1998, rev(apply(t(hold3.forecast), 1, function(x) quantile(x, .975)))), col = "green", lwd = 1, lty = 2)
lines(supsmu(1969:1998, rev(apply(t(hold3.forecast), 1, function(x) quantile(x, .025)))), col = "green", lwd = 1, lty = 2)
legend("topleft", c("Smoothed Estimate", "Smoothed 95% Quantiles", "HADCRU NH Data", "Smoothed Estimate Interpolation w/ Holdout", "Smoothed Estimate Projection w/ Holdout", "Smoothed 95% Quantiles Projection w/ Holdout"), col = c("black", "black", "red", rep("green",3)), lty = c(1,2,1,1,1,2), lwd = c(3,1,2,3,3,1), box.lwd = 0, pch = c(rep("", 4), "o", ""))

# part four!
# now project over the whole interval!
hold.sim <- 400
inits <- as.numeric(mcshane.data[1,2:3]) # say known
hold4.forecast <- matrix(0, nrow = hold.sim, ncol = 149)
for (i in 1:hold.sim) {
  for (t in 1:149) {
    if (t == 1)
      dat <- inits
    if (t == 2)
      dat <- c(hold4.forecast[i,t-1], inits[2])
    if (t >= 3)
      dat <- hold4.forecast[i,t-(1:2)]
    #  PC
    pc <- as.numeric(mcshane.data[t,-(1:3)])
    # process error
    mu <- sum(apply(hold.beta[[1]], 2, function(x) sample(x, 1)) *
              c(1, as.numeric(dat), pc))
    # resid error
    hold4.forecast[i,t] <- rnorm(1, mu,
                                 1/sqrt(sample(hold.precsigma[[1]], 1)))
  }
}

# mcshanefig18d.png
lines(supsmu(1850:1998, rev(apply(t(hold4.forecast), 1, function(x) quantile(x, .5)))), col = "orange", lwd = 3, type = "l")
lines(supsmu(1850:1998, rev(apply(t(hold4.forecast), 1, function(x) quantile(x, .975)))), col = "orange", lwd = 1, lty = 2)
lines(supsmu(1850:1998, rev(apply(t(hold4.forecast), 1, function(x) quantile(x, .025)))), col = "orange", lwd = 1, lty = 2)
legend("topleft", c("Smoothed Estimate", "Smoothed 95% Quantiles", "HADCRU NH Data", "Smoothed Estimate w/ Holdout", "Smoothed 95% Quantiles w/ Holdout"), col = c("black", "black", "red", rep("orange",2)), lty = c(1,2,1,1,2), lwd = c(3,1,2,3,1), box.lwd = 0, pch = c(rep("", 3), "", ""))

Just be cognizant of where you put the files, and the above code will plot the results (also get rid of the superfluous last two lines of the original data file).

I first plot the Bayesian estimates for the whole dataset. The uncertainty bounds incorporate the uncertainty in all of the parameters as well as the residual error, so in frequentist language they are prediction intervals rather than confidence intervals.

The model fits pretty well in general, and I got good estimates without too many samples.

M&W2010 also did a test where they held out a 30-year block of the data, in order to project the temperature time series given the model and the remaining data. This process wasn’t described well in the paper, but I presume they just did a nonparametric bootstrap off the posterior distributions (Update: the way I calculate the projection is flawed; hopefully I’ll have a post on this later).
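One bootstrap replicate then amounts to: pick a draw from the posterior samples, form the process mean for the covariate row, and add residual noise. A minimal sketch of a single draw (ppredict1 is a hypothetical helper; x would be c(1, lag1, lag2, PC1..PC10)):

## one posterior-predictive draw for covariate row x (hypothetical helper)
ppredict1 <- function(beta.mat, prec.vec, x) {
  k <- sample(nrow(beta.mat), 1)         # one joint posterior draw
  mu <- sum(beta.mat[k, ] * x)           # process mean
  rnorm(1, mu, 1 / sqrt(prec.vec[k]))    # residual error
}

Note that indexing one whole row of beta.mat preserves the posterior correlations between the coefficients; the loops in my code above sample each coefficient independently instead, which is a simplification.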

Update 2: I was bootstrapping over the actual data of the most recent 30-year period. This was obviously wrong, and I’ve replaced it with the projected values. The new graph is reflected in the code; just search for part deux. Interestingly enough, the projection looks even crappier in terms of predicting recent warming. M&W2010 interpret this as crappy proxies; climatologists interpret it as a regime change in the atmospheric dynamics (old graph here).

Update 3: oops, I didn’t account for the backcast (old graph here). See part three in the code.

Update 4: last update; I did the projection over the whole interval (old graph here). See part four in the code.

The following is the result:

I think this is pretty much identical to Fig. 18b in the paper. One thing to notice is that I smoothed the estimates and the 95% intervals. M&W2010, oddly, draw every realization of the simulations (notice their wide grey intervals). Since all the realizations of the model overlap each other, their graph effectively shows the maximum and minimum of the 95% interval. Where my interval would probably be read as about +/- 0.4 on each side, M&W2010’s could be read as +/- 0.5 degrees. That is a large difference, especially for a paper that espouses large uncertainties in the proxy data. OTOH, Mann et al. (2008) produce mean uncertainty bounds, I think, which might be confusing depending on one’s perspective (or I guess I should say it’s confusing if you’re going to compare the two!).

Apparently there is another paper (w/ code and data) coming out soon that uses Bayesian hierarchical models, which I might recreate as well (Bo Li, Douglas W. Nychka, and Caspar M. Ammann. 2010. The Value of Multi-proxy Reconstruction of Past Climate). More fun to chew on.

Refs:

  1. McShane, B. B., & Wyner, A. J. (2010). A statistical analysis of multiple temperature proxies: Are reconstructions of surface temperatures over the last 1000 years reliable? Annals of Applied Statistics, 4(3). (link)
  2. Mann, M., Zhang, Z., Hughes, M., Bradley, R., Miller, S., Rutherford, S., & Ni, F. (2008). Proxy-based reconstructions of hemispheric and global surface temperature variations over the past two millennia. Proceedings of the National Academy of Sciences, 105(36), 13252-13257. DOI: 10.1073/pnas.0805721105

Filed under: Bayesian, Bootstrap, R, rjags, Time Series
