Parallel JAGS RNGs

July 23, 2011

(This article was first published on Statistical Modeling, Causal Inference, and Social Science, and kindly contributed to R-bloggers)

As a matter of convention, we usually run 3 or 4 chains in JAGS. By default, this gives rise to chains that draw samples from 3 or 4 distinct pseudorandom number generators. I didn’t go and check whether it does things 111,222,333 or 123,123,123, but in any event the “parallel chains” in JAGS are samples drawn from distinct RNGs computed on a single processor core.

But we all have multiple cores now, or we’re computing on a cluster or the cloud! So the behavior we’d like from rjags is to use the foreach package with each JAGS chain using a parallel-safe RNG. The default behavior with n.chain=1 will be that each parallel instance will use[1], the Wichmann-Hill RNG.

JAGS 2.2.0 includes a new lecuyer module (along with the glm module, which everyone should probably always use, and doesn’t have many undocumented tricks that I know of). But lecuyer is completely undocumented! I tried"lecuyer::Lecuyer","lecuyer::lecuyer", and"lecuyer::LEcuyer"
all to no avail. It ought to be"lecuyer::Lecuyer" to be consistent with the other values! I looked around in the source to find where it checks its name from the inits, to discover that in fact it is"lecuyer::RngStream"

So here’s how I set up 4 chains now:

library(doMC); registerDoMC()
library(rjags); load.module("glm"); load.module("lecuyer")
jinits <- function() {
   ### all the other params ###"lecuyer::RngStream",
  .Rng.seed=randomNumbers(n = 1, min = 1, max = 1e+06,col=1)
jags.parsamples <- foreach(i=1:getDoParWorkers()) %dopar% {
  model.jags <- jags.model(model, forJAGS,
                           n.chain=1, n.adapt=1000)
  result <- coda.samples(model.jags,params,1000)

I would just as soon initialize them to the same state and use sequential substreams, but I think there is no way to do this. Four long separately-seeded streams should be more than fine; a quick look suggests that if you did n.chain>1 (on each core) you’d get sequential substreams.

I should also probably write a better .combine so that it’s an mcmc.list and not just a list, but whatever. This works, almost 4 times (yeah yeah overhead blah blah) faster than the usual n.chain=4 would be!

To leave a comment for the author, please follow the link and comment on his blog: Statistical Modeling, Causal Inference, and Social Science. offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , , ,

Comments are closed.

Top 3 Posts from the past 2 days

Top 9 articles of the week

  1. Installing R packages
  2. Hypothesis Testing: Fishing for Trouble
  3. In-depth introduction to machine learning in 15 hours of expert videos
  4. Scatterplots
  5. Using apply, sapply, lapply in R
  6. Machine Learning in R for beginners
  7. Regression Models, It’s Not Only About Interpretation
  8. Basics of Histograms
  9. I'm all about that bootstrap ('bout that bootstrap)