Stan 1.2.0 and RStan 1.2.0

[This article was first published on Statistical Modeling, Causal Inference, and Social Science » R, and kindly contributed to R-bloggers.]

Stan 1.2.0 and RStan 1.2.0 are now available for download. See:

Here are the highlights.

Full Mass Matrix Estimation during Warmup

Yuanjun Gao, a first-year grad student here at Columbia (!), built a regularized mass-matrix estimator. This helps for posteriors with high correlation among parameters and varying scales. We’re still testing this ourselves, so the estimation procedure may change in the future (don’t worry — it satisfies detailed balance as is, but we might be able to make it more computationally efficient in terms of time per effective sample).

It’s not the default option. The main reason is that the required matrix operations are expensive, raising the algorithm’s cost to $\mathcal{O}(k m n^2 + n^3 \log m)$, where $k$ is the average number of leapfrog steps, $m$ is the number of iterations, and $n$ is the number of parameters.
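
To get a feel for how that bound grows, the sketch below simply evaluates the expression for a few parameter counts. The function name `dense_cost` and the example values of k and m are illustrative assumptions, and the constants hidden by the big-O are ignored, so only the ratios between rows mean anything.

```python
import math

def dense_cost(k, m, n):
    """Evaluate k*m*n^2 + n^3*log(m), ignoring big-O constants."""
    return k * m * n**2 + n**3 * math.log(m)

# Hypothetical settings: 10 leapfrog steps on average, 1000 iterations.
k, m = 10, 1000
for n in (10, 100, 1000):
    print(n, f"{dense_cost(k, m, n):.3g}")
```

Doubling the number of parameters multiplies the first term by 4 and the second by 8, which is why the cost matters most for large models.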

Yuanjun did a great job with the Cholesky factorizations and implemented this about as efficiently as is possible. (His homework for Andrew’s class was also the inspiration for the Gaussian process models in the manual.)
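
As a rough illustration of the idea (not Yuanjun's actual estimator, whose regularization isn't described here), the sketch below estimates a covariance matrix from a handful of made-up warmup draws, shrinks it toward the identity, and takes its Cholesky factor. The shrinkage rule, the `shrink` value, and all function names are assumptions for the example.

```python
import math

def sample_covariance(draws):
    """Unbiased sample covariance of a list of equal-length draws."""
    m, n = len(draws), len(draws[0])
    means = [sum(d[j] for d in draws) / m for j in range(n)]
    cov = [[0.0] * n for _ in range(n)]
    for d in draws:
        for i in range(n):
            for j in range(n):
                cov[i][j] += (d[i] - means[i]) * (d[j] - means[j])
    return [[c / (m - 1) for c in row] for row in cov]

def regularize(cov, shrink=0.1):
    """Hypothetical rule: (1 - shrink) * cov + shrink * identity."""
    n = len(cov)
    return [[(1 - shrink) * cov[i][j] + (shrink if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def cholesky(a):
    """Lower-triangular L with L * L^T == a, for positive-definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

# Made-up, strongly correlated two-parameter warmup draws.
draws = [[0.0, 0.1], [1.0, 0.9], [2.0, 2.2], [3.0, 2.8]]
L = cholesky(regularize(sample_covariance(draws)))
print(L)
```

The shrinkage keeps the estimate positive definite even when there are few draws relative to the number of parameters, which is what makes the factorization safe to compute.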

It’s integrated with NUTS.

Cumulative Distribution Functions

The practical upshot is that Stan supports more truncated distributions, and hence more truncated and censored data models.
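
To see why distribution functions are what truncation needs, here is a minimal sketch in plain Python (not Stan code) of a truncated normal log density: the ordinary log density minus the log of the probability mass between the bounds. The function names are made up for the example.

```python
import math

def normal_lpdf(y, mu, sigma):
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_cdf(y, mu, sigma):
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

def truncated_normal_lpdf(y, mu, sigma, lower, upper):
    """Log density of a normal restricted to [lower, upper]."""
    if y < lower or y > upper:
        return -math.inf
    # Renormalize by the mass the untruncated normal puts on the interval.
    mass = normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)
    return normal_lpdf(y, mu, sigma) - math.log(mass)

print(truncated_normal_lpdf(1.0, 0.5, 1.0, 0.0, 2.0))
```

A right-censored observation works similarly: its likelihood contribution is the log of one minus the distribution function at the censoring point.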

Michael Betancourt did the heavy lifting here, which involved a crazy amount of “special function” derivative calculations and implementations. Everyone knows that the derivative of a distribution function with respect to the variate is the density. But what about the partials with respect to the other parameters? We’ll be documenting all of the functions and derivatives in the manual.
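
The normal distribution is one of the easy cases and makes the question concrete. The sketch below writes out the partials of the normal distribution function with respect to the variate, location, and scale, then compares them with central finite differences. It is only an illustration in Python with made-up function names, not Stan's implementation.

```python
import math

def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(y, mu, sigma):
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

def normal_cdf_partials(y, mu, sigma):
    """Analytic partials of the normal CDF w.r.t. (y, mu, sigma)."""
    pdf = normal_pdf(y, mu, sigma)
    z = (y - mu) / sigma
    return pdf, -pdf, -z * pdf

def finite_difference(f, args, index, h=1e-6):
    """Central difference of f in its index-th argument."""
    up, down = list(args), list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)

args = (1.3, 0.4, 2.0)  # arbitrary test point (y, mu, sigma)
for i, analytic in enumerate(normal_cdf_partials(*args)):
    print(analytic, finite_difference(normal_cdf, args, i))
```

Distributions whose distribution functions involve incomplete gamma or beta functions are where the derivatives get hard.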

Daniel Lee generalized the entire density and distribution function testing framework to generate code for tests. We’re doing much more extensive tests of the vectorizations and derivatives. Also, Daniel implemented efficient vectorized derivatives for many more of the density functions.

Model Log Probability and Derivatives in R

Jiqiang Guo, who’s at the helm of RStan, wrote code to allow users to access the log probability function in a Stan model and its gradients directly. The functions are parameterized with the unconstrained parameterization of a Stan model with support on all of R^N. He also exposed the model functions to convert back and forth between the constrained and unconstrained parameterizations for initialization and interpretation of the samples.
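
For readers who haven't met the unconstrained parameterization, here is a toy version in plain Python rather than RStan. It assumes a model y ~ normal(0, sigma) with sigma > 0, a log transform to the unconstrained scale, and a log probability that includes the log Jacobian of that transform. The data and all function names are made up for the example and are not the RStan interface.

```python
import math

y = [1.0, -2.0, 0.5, 1.5, -1.0]  # made-up data

def unconstrain(sigma):
    """Map sigma > 0 to the whole real line."""
    return math.log(sigma)

def constrain(u):
    """Inverse map back to sigma > 0."""
    return math.exp(u)

def log_prob(u):
    """Log density on the unconstrained scale, Jacobian included."""
    sigma = constrain(u)
    n = len(y)
    ss = sum(v * v for v in y)
    log_lik = (-n * math.log(sigma) - ss / (2.0 * sigma**2)
               - 0.5 * n * math.log(2.0 * math.pi))
    return log_lik + u  # log |d sigma / d u| = u

def grad_log_prob(u):
    """Derivative of log_prob with respect to u."""
    sigma = constrain(u)
    ss = sum(v * v for v in y)
    return -len(y) + ss / sigma**2 + 1.0

u = unconstrain(1.5)
print(log_prob(u), grad_log_prob(u), constrain(u))
```

Any real number is a valid input to `log_prob` here, which is what makes the unconstrained scale convenient for general-purpose algorithms.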

David Blei suggested that if we added this feature, people could do interesting things in R with it, such as optimization. Let us know if you find it helpful.
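
As a sketch of the kind of thing this enables, the snippet below runs plain gradient ascent on an unconstrained log probability. It uses a toy model in Python rather than RStan: y ~ normal(0, sigma) with u = log(sigma) and the Jacobian included. The data, step size, and iteration count are arbitrary assumptions.

```python
import math

y = [1.0, -2.0, 0.5, 1.5, -1.0]  # made-up data
n = len(y)
ss = sum(v * v for v in y)

def grad_log_prob(u):
    # Gradient of -n*u - ss*exp(-2u)/2 + u, where the last u is the Jacobian.
    return -n + ss * math.exp(-2.0 * u) + 1.0

u = 0.0  # start at sigma = 1 on the unconstrained scale
for _ in range(200):
    u += 0.05 * grad_log_prob(u)  # fixed step size, chosen by hand

sigma_hat = math.exp(u)
print(sigma_hat**2, ss / (n - 1))
```

Because the Jacobian term is included, the maximizer satisfies sigma^2 = ss / (n - 1) rather than the maximum-likelihood value ss / n, so whether to keep that term depends on what you want to optimize.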

Print Posterior Summary Statistics from Command Line

Daniel Lee wrote a program to print a summary of one or more chains from the command line, mirroring the print() command of RStan.
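
For a sense of what goes into such a summary, here is a minimal Python sketch that computes a mean, standard deviation, median, and a basic potential scale reduction statistic (R-hat) for one parameter across chains. It is not the command-line program, the function names are made up, and the R-hat formula is the textbook between/within-variance version, which may differ in detail from what Stan computes.

```python
import math
import statistics

def rhat(chains):
    """Basic R-hat from equal-length chains of draws of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.mean(c) for c in chains]
    within = statistics.mean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)

def summarize(chains):
    """Pool the chains and report a few summary statistics."""
    draws = [x for c in chains for x in c]
    return {
        "mean": statistics.mean(draws),
        "sd": statistics.stdev(draws),
        "median": statistics.median(draws),
        "rhat": rhat(chains),
    }

# Made-up draws from two chains that disagree about the location.
chains = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]
print(summarize(chains))
```

Values of R-hat well above 1, as in this deliberately bad example, indicate that the chains have not mixed.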

Bug Fixes

We fixed a bad memory leak in multivariate operations, which was introduced in the last release when we optimized the matrix operations for derivative calculations. We also fixed an issue with conservative matrix resizing that caused multivariate models to crash under Windows at optimization levels above 0.

The Future

There has been a lot of activity in various branches that haven’t been merged into the trunk yet, so stay tuned.

Release Notes
v1.2.0 (6 March 2013)

* full mass matrix estimation during warmup
* expose model log_prob and gradient functions in RStan for use
  in other packages (such as optimizers)
* command-line program to display output from multiple chains
  with parameter-by-parameter mean, se, sd, quantiles, and R-hat
* probability function speed improvements with vectorization
* created Stan contributed repositories for user-contributed
  and experimental features (first entry is an emacs mode)
* modified makefiles so targets are the same under Windows,
  Linux, and Mac

New Functions
* most of the cumulative distribution functions (see the documentation
  index for the full list of supported functions)
* added monitor() function in RStan

Bug Fixes
* disabled Boost asserts in parser to quiet R's warnings
* enabled prints in generated quantities block
* various documentation patches
* fixed memory leak in matrix operations leading to leaks in
  multivariate probability function use
* wrapped call to gradient log prob to catch unexpected exceptions
* fixed matrix resize issue on Windows that caused models to fail
  at optimization levels above 0
* fixed bug in print preventing hyphens or grave accents from
* fixed issue preventing matrix rows from being assigned on the
  left side of an assignment statement
* clearer error messages on matrix and other function arguments
