Parallel JAGS RNGs

July 23, 2011

(This article was first published on Statistical Modeling, Causal Inference, and Social Science, and kindly contributed to R-bloggers)

As a matter of convention, we usually run 3 or 4 chains in JAGS. By default, this gives rise to chains that draw samples from 3 or 4 distinct pseudorandom number generators. I didn’t go and check whether it does things 111,222,333 or 123,123,123, but in any event the “parallel chains” in JAGS are samples drawn from distinct RNGs computed on a single processor core.

But we all have multiple cores now, or we’re computing on a cluster or the cloud! So the behavior we’d like from rjags is to use the foreach package with each JAGS chain using a parallel-safe RNG. The default behavior with n.chain=1 will be that each parallel instance will use[1], the Wichmann-Hill RNG.

JAGS 2.2.0 includes a new lecuyer module (along with the glm module, which everyone should probably always use, and doesn’t have many undocumented tricks that I know of). But lecuyer is completely undocumented! I tried"lecuyer::Lecuyer","lecuyer::lecuyer", and"lecuyer::LEcuyer"
all to no avail. It ought to be"lecuyer::Lecuyer" to be consistent with the other values! I looked around in the source to find where it checks its name from the inits, to discover that in fact it is"lecuyer::RngStream"

So here’s how I set up 4 chains now:

library(doMC); registerDoMC()
library(rjags); load.module("glm"); load.module("lecuyer")
jinits <- function() {
   ### all the other params ###"lecuyer::RngStream",
  .Rng.seed=randomNumbers(n = 1, min = 1, max = 1e+06,col=1)
jags.parsamples <- foreach(i=1:getDoParWorkers()) %dopar% {
  model.jags <- jags.model(model, forJAGS,
                           n.chain=1, n.adapt=1000)
  result <- coda.samples(model.jags,params,1000)

I would just as soon initialize them to the same state and use sequential substreams, but I think there is no way to do this. Four long separately-seeded streams should be more than fine; a quick look suggests that if you did n.chain>1 (on each core) you’d get sequential substreams.

I should also probably write a better .combine so that it’s an mcmc.list and not just a list, but whatever. This works, almost 4 times (yeah yeah overhead blah blah) faster than the usual n.chain=4 would be!

To leave a comment for the author, please follow the link and comment on their blog: Statistical Modeling, Causal Inference, and Social Science. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , , ,

Comments are closed.

Search R-bloggers

Most visited articles of the week

  1. How to write the first for loop in R
  2. R Studio Shortcuts and Tips – part 2
  3. Learning R: The Ultimate Introduction (incl. Machine Learning!)
  4. Modern Data Science with R: A review
  5. 5 Ways to Subset a Data Frame in R
  6. Part 2: Simple EDA in R with inspectdf
  7. R – Sorting a data frame by the contents of a column
  8. Using apply, sapply, lapply in R
  9. Installing R packages


RSS Jobs for R users

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)