As a matter of convention, we usually run 3 or 4 chains in JAGS. By default, this gives rise to chains that draw samples from 3 or 4 distinct pseudorandom number generators. I didn’t go and check whether it runs the chains one after another (111,222,333) or interleaves them (123,123,123), but in any event the “parallel chains” in JAGS are samples drawn from distinct RNGs computed on a single processor core.
But we all have multiple cores now, or we’re computing on a cluster or the cloud! So the behavior we’d like from rjags is to run each JAGS chain in its own process via the foreach package, with each chain using a parallel-safe RNG. With n.chain=1, though, every parallel instance defaults to .RNG.name[1], the Wichmann-Hill RNG, so all the workers end up on the same generator, which is not what we want.
JAGS 2.2.0 includes a new lecuyer module (along with the glm module, which everyone should probably always use, and doesn’t have many undocumented tricks that I know of). But lecuyer is completely undocumented! I tried .RNG.name="lecuyer::Lecuyer", .RNG.name="lecuyer::lecuyer", and .RNG.name="lecuyer::LEcuyer"
all to no avail. It ought to be .RNG.name="lecuyer::Lecuyer" to be consistent with the other .RNG.name values! I looked around in the source to find where it checks its name from the inits, to discover that in fact it is
.RNG.name="lecuyer::RngStream"
So here’s how I set up 4 chains now:
library(doMC); registerDoMC()
library(rjags); load.module("glm"); load.module("lecuyer")
library(random)
# Build the inits for one chain; returns a named list
jinits <- function() {
  list(
    ### all the other params ###
    .RNG.name = "lecuyer::RngStream",
    # seed fetched via the random package; as.numeric() drops the 1x1 matrix dims
    .RNG.seed = as.numeric(randomNumbers(n = 1, min = 1, max = 1e+06, col = 1))
  )
}
# One single-chain model per registered worker
jags.parsamples <- foreach(i = 1:getDoParWorkers()) %dopar% {
  model.jags <- jags.model(model, forJAGS,
                           inits = jinits,
                           n.chains = 1, n.adapt = 1000)
  result <- coda.samples(model.jags, params, 1000)
  return(result)
}
I would just as soon initialize them to the same state and use sequential substreams, but I think there is no way to do this. Four long separately-seeded streams should be more than fine; a quick look suggests that if you did n.chain>1 (on each core) you’d get sequential substreams.
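For readers on a newer setup: later rjags releases than the one used here include a parallel.seeds() helper, which asks an RNG factory for one initial state per chain so that each chain gets its own independent stream. The following is only a sketch, assuming an rjags version that provides parallel.seeds() and a JAGS installation with the lecuyer module; the variable name rng.inits is made up for illustration.

```r
library(rjags)
load.module("lecuyer")

# Ask the lecuyer factory for 4 independent RNG states, one per chain.
# Each element is a list with .RNG.name and .RNG.state.
rng.inits <- parallel.seeds("lecuyer::RngStream", 4)

# Inspect what would be handed to the first chain
str(rng.inits[[1]])
```

Inside the foreach loop, worker i would then pass inits=rng.inits[[i]] (merged with any other initial values) to jags.model in place of a separately seeded inits function.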
I should also probably write a better .combine so that it’s an mcmc.list and not just a list, but whatever. This works, almost 4 times (yeah yeah overhead blah blah) faster than the usual n.chain=4 would be!
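One way to get an mcmc.list out of the plain list is sketched below. It assumes each worker returns a single-chain mcmc.list (which is what coda.samples gives back) and that all chains share the same iterations and variable names. The helper name as_one_mcmc_list and the fake data are hypothetical, standing in for jags.parsamples so the example runs without JAGS.

```r
library(coda)

# Hypothetical helper: flatten a list of single-chain mcmc.list objects
# into one mcmc.list with one chain per worker.
as_one_mcmc_list <- function(pieces) {
  # unclass() turns each mcmc.list into a plain list of mcmc objects,
  # and c() concatenates those plain lists
  chains <- do.call(c, lapply(pieces, unclass))
  mcmc.list(chains)
}

# Stand-in for jags.parsamples: four fake single-chain results
set.seed(1)
fake_parsamples <- lapply(1:4, function(i) {
  draws <- matrix(rnorm(200), ncol = 2,
                  dimnames = list(NULL, c("alpha", "beta")))
  mcmc.list(mcmc(draws))
})

combined <- as_one_mcmc_list(fake_parsamples)
```

With the real output this would be as_one_mcmc_list(jags.parsamples), after which the usual coda tools such as gelman.diag and summary see all the chains together.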
