September 2, 2012

(This article was first published on JAGS News » R, and kindly contributed to R-bloggers)

Andrew Gelman has announced the release of Stan version 1.0.0 and its R interface RStan.  Stan – named after Stanislaw Ulam, the inventor of the Monte Carlo method – is a new MCMC program that represents a major technological leap forward. It works flawlessly on my Linux desktop and is very, very fast. (Note that I have nothing to do with the Stan project: I am just posting this here because it is clearly of interest to JAGS users).

The starting point for Stan was that some large hierarchical models are intractable by Gibbs sampling.  Everyone has come across examples in which the Gibbs sampler will not converge in a reasonable time because the variables are highly correlated. Older BUGS users will remember re-parametrizing their models to reduce this correlation, e.g. by centring predictor variables like this:

x.mean <- mean(x[])
for (i in 1:N) {
   y[i] ~ dnorm(mu[i], tau)
   mu[i] <- alpha + beta * (x[i] - x.mean)

OpenBUGS introduced updaters that sample blocks of correlated parameters together.  The glm module in JAGS also uses block sampling for more efficient sampling of generalized linear models.  Stan takes this approach to its logical conclusion by using Hamiltonian Monte Carlo [1] to do block updating of all the parameters in the model together.  It can thereby tackle problems that remain beyond the scope of OpenBUGS and JAGS.

Stan works a bit differently from OpenBUGS and JAGS. It takes a description of the model and converts it into C++ code, then compiles the C++ program into a standalone executable file. This executable is run on a data set to produce the sampled output.  The model description used by Stan is a BUGS-like language, which BUGS users should find easy to read.  To help BUGS users further, the Stan User’s Guide contains an appendix that highlights the similarities and differences between Stan and BUGS.  The Stan distribution also contains the familiar BUGS examples converted to Stan format.

Despite the similarities, there are still some surprises for BUGS users. Since the model description is converted to C++ before being compiled, the programming style is imperative, not declarative. This means you can do some things that are impossible in BUGS, such as defining multiple likelihoods for the same variable

y ˜ normal(0,1);
y ˜ normal(2,3);

The contributions to the log-likelihood are added together.  You can also have local variables that are modified inside a for loop:

   total <- 0;
   for (i in 1:n) {
      theta[i] ˜ normal(total, sigma);
      total <- total + theta[i];

Rethinking everything from the ground up has allowed the Stan developers to correct some of the design problems of BUGS, such as the counter-intuitive parametrization of distributions in terms of the precision rather than the standard deviation or variance. This was important in the early days of BUGS to allow efficient Gibbs sampling of conjugate distributions. However, nobody can express an informative prior in terms of the precision, which leads to extensive use of parameter transformations and rather verbose BUGS code.

That’s about all I have to say at this point. Stan 1.0.0 was only released two days ago so I have not had much time to explore further. To find out more, visit the Stan web site:

[1] For an explanation of Hamiltonian Monte Carlo, see the introduction by Radford Neal in The Handbook of Markov Chain Monte Carlo. The sampler used by Stan is the No U-Turn Sampler, or NUTS.

To leave a comment for the author, please follow the link and comment on their blog: JAGS News » R. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.


Mango solutions

RStudio homepage

Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training


CRC R books series

Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)