Stan 1.2.0 and RStan 1.2.0 are now available for download.
Here are the highlights.
Full Mass Matrix Estimation during Warmup
Yuanjun Gao, a first-year grad student here at Columbia (!), built a regularized mass-matrix estimator. This helps for posteriors with high correlation among parameters and varying scales. We’re still testing this ourselves, so the estimation procedure may change in the future (don’t worry — it satisfies detailed balance as is, but we might be able to make it more computationally efficient in terms of time per effective sample).
It’s not the default option. The major reason is that the matrix operations required are expensive, raising the algorithm cost to $\mathcal{O}(L M N^2)$, where $L$ is the average number of leapfrog steps, $M$ is the number of iterations, and $N$ is the number of parameters.
Yuanjun did a great job with the Cholesky factorizations and implemented this about as efficiently as is possible. (His homework for Andrew’s class was also the inspiration for the Gaussian process models in the manual.)
It’s integrated with NUTS.
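To make the cost concrete, here is a rough sketch in plain Python of where a dense mass matrix shows up in a leapfrog step. It is not Stan's code: the function names, the shrink-toward-identity rule, and its `weight` are all assumptions made up for illustration, and the sketch takes the inverse mass matrix to be a regularized covariance of the warmup draws. The matrix-vector product in `leapfrog` is the part whose cost grows with the square of the number of parameters.

```python
import math
import random

def sample_covariance(draws):
    """Sample covariance (N x N) of a list of warmup draws, each of length N."""
    m, n = len(draws), len(draws[0])
    mean = [sum(d[i] for d in draws) / m for i in range(n)]
    cov = [[0.0] * n for _ in range(n)]
    for d in draws:
        for i in range(n):
            for j in range(n):
                cov[i][j] += (d[i] - mean[i]) * (d[j] - mean[j])
    return [[c / (m - 1) for c in row] for row in cov]

def regularize(cov, weight=0.05):
    """Shrink toward the identity. Hypothetical rule, not Stan's estimator."""
    n = len(cov)
    return [[(1.0 - weight) * cov[i][j] + (weight if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def cholesky(a):
    """Lower-triangular factor with factor * factor^T = a (cubic cost in N)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def draw_momentum(chol_inv_mass, rng):
    """Draw p ~ Normal(0, M), where M^{-1} = C C^T, by solving C^T p = z."""
    n = len(chol_inv_mass)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    p = [0.0] * n
    for i in reversed(range(n)):
        s = sum(chol_inv_mass[k][i] * p[k] for k in range(i + 1, n))
        p[i] = (z[i] - s) / chol_inv_mass[i][i]
    return p

def matvec(a, v):
    """Dense matrix-vector product: quadratic cost in the number of parameters."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def leapfrog(q, p, grad_log_prob, inv_mass, eps):
    """One leapfrog step with a dense inverse mass matrix."""
    g = grad_log_prob(q)
    p = [pi + 0.5 * eps * gi for pi, gi in zip(p, g)]
    v = matvec(inv_mass, p)  # the expensive dense operation
    q = [qi + eps * vi for qi, vi in zip(q, v)]
    g = grad_log_prob(q)
    p = [pi + 0.5 * eps * gi for pi, gi in zip(p, g)]
    return q, p

# Toy usage: four made-up warmup draws and a standard normal target.
warmup = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.5], [3.0, 5.5]]
inv_mass = regularize(sample_covariance(warmup))
chol = cholesky(inv_mass)
p0 = draw_momentum(chol, random.Random(1))
q1, p1 = leapfrog([0.3, -0.7], p0, lambda q: [-x for x in q], inv_mass, 0.1)
```

With a diagonal mass matrix the `matvec` call collapses to an elementwise product, which is why the dense version costs more per step.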
Cumulative Distribution Functions
The practical upshot is that Stan supports more truncated distributions, and hence more truncated and censored data models.
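Here is why the distribution functions matter for truncation and censoring, sketched in plain Python for the normal case. The function names are ours for illustration and are not Stan functions. A truncated density is the ordinary density divided by the probability mass between the bounds, and a right-censored observation contributes the log of the upper tail probability.

```python
import math

def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_cdf(y, mu, sigma):
    """Cumulative distribution function of Normal(mu, sigma) at y."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

def truncated_normal_lpdf(y, mu, sigma, lower, upper):
    """Log density of Normal(mu, sigma) truncated to [lower, upper]."""
    if y < lower or y > upper:
        return -math.inf
    mass = normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)
    return normal_lpdf(y, mu, sigma) - math.log(mass)

def right_censored_normal_lp(censor_point, mu, sigma):
    """Log probability that an observation exceeds censor_point."""
    return math.log(1.0 - normal_cdf(censor_point, mu, sigma))
```

This direct form loses precision far out in the tails, where the difference of two distribution function values underflows; a production implementation would want something more careful there.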
Michael Betancourt did the heavy lifting here, which involved a crazy amount of “special function” derivative calculations and implementations. Everyone knows that the derivative of a distribution function with respect to the variate is the density. But what about the partials with respect to the other parameters? We’ll be documenting all of the functions and derivatives in the manual.
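For the normal distribution the answer is the textbook one, and it makes a handy sanity check. The sketch below is plain Python with made-up function names, not Stan's implementation: it writes down the partials of the normal distribution function with respect to the variate, the location, and the scale, and compares each against a central finite difference.

```python
import math

def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(y, mu, sigma):
    """Cumulative distribution function of Normal(mu, sigma) at y."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

def normal_cdf_partials(y, mu, sigma):
    """Analytic partial derivatives of normal_cdf with respect to y, mu, sigma."""
    z = (y - mu) / sigma
    density = std_normal_pdf(z) / sigma  # partial in y is the density
    return {"y": density, "mu": -density, "sigma": -z * density}

def central_difference(f, x, h=1e-6):
    """Numerical derivative of a one-argument function f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Compare analytic and numerical partials at one arbitrary point.
y, mu, sigma = 1.2, 0.4, 1.7
analytic = normal_cdf_partials(y, mu, sigma)
numeric = {
    "y": central_difference(lambda t: normal_cdf(t, mu, sigma), y),
    "mu": central_difference(lambda t: normal_cdf(y, t, sigma), mu),
    "sigma": central_difference(lambda t: normal_cdf(y, mu, t), sigma),
}
```

The normal case is the easy one. Distributions whose distribution functions involve incomplete gamma or beta functions are where the special-function derivatives get hard.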
Daniel Lee generalized the entire density and distribution function testing framework to generate code for tests. We’re doing much more extensive tests of the vectorizations and derivatives. Also, Daniel implemented efficient vectorized derivatives for many more of the density functions.
Model Log Probability and Derivatives in R
Jiqiang Guo, who’s at the helm of RStan, wrote code to allow users to access the log probability function in a Stan model and its gradients directly. The functions are parameterized with the unconstrained parameterization of a Stan model with support on all of R^N. He also exposed the model functions to convert back and forth between the constrained and unconstrained parameterizations for initialization and interpretation of the samples.
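To illustrate what working on the unconstrained scale means, here is a toy sketch in plain Python. None of these names are RStan functions; they are hypothetical stand-ins. A positive parameter is mapped to the real line with a log transform and a probability with a logit transform, and the log density on the unconstrained scale picks up the log Jacobian of the inverse transform. The toy model is a binomial likelihood with a flat prior on the success probability.

```python
import math

def unconstrain_positive(sigma):
    """Map sigma > 0 to the real line."""
    return math.log(sigma)

def constrain_positive(u):
    """Inverse of unconstrain_positive; returns (sigma, log Jacobian)."""
    return math.exp(u), u

def unconstrain_unit(theta):
    """Map theta in (0, 1) to the real line with the logit."""
    return math.log(theta / (1.0 - theta))

def constrain_unit(u):
    """Inverse logit; returns (theta, log Jacobian = log(theta * (1 - theta)))."""
    theta = 1.0 / (1.0 + math.exp(-u))
    log_jacobian = -math.log1p(math.exp(-u)) - math.log1p(math.exp(u))
    return theta, log_jacobian

def log_prob_unconstrained(u, successes, trials):
    """Toy binomial model with a flat prior on theta, as a function of u."""
    theta, log_jacobian = constrain_unit(u)
    log_lik = (successes * math.log(theta)
               + (trials - successes) * math.log(1.0 - theta))
    return log_lik + log_jacobian
```

The Jacobian term is what keeps the density on the unconstrained scale consistent with the one on the constrained scale. For very large values of `u` the `exp` calls here overflow, so a real implementation would guard against that.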
David Blei suggested that if we added this feature, people could do interesting things in R with it, such as optimization. Let us know if you find it helpful.
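As a sketch of the kind of thing David has in mind, here is plain gradient ascent driven only by a log probability function and its gradient. It is written in Python with a stand-in target (an independent two-dimensional normal centered at (1, -2)) in place of a real Stan model. The function names, the step size, and the iteration count are assumptions for illustration.

```python
def toy_log_prob(theta):
    """Stand-in for a model's log probability on the unconstrained scale."""
    return -0.5 * ((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2 / 4.0)

def toy_grad_log_prob(theta):
    """Gradient of toy_log_prob."""
    return [-(theta[0] - 1.0), -(theta[1] + 2.0) / 4.0]

def gradient_ascent(grad_log_prob, init, step_size=0.5, iterations=200):
    """Climb the log probability surface with fixed-size gradient steps."""
    theta = list(init)
    for _ in range(iterations):
        g = grad_log_prob(theta)
        theta = [t + step_size * gi for t, gi in zip(theta, g)]
    return theta

# Starting from the origin, the iterates approach the mode at (1, -2).
mode = gradient_ascent(toy_grad_log_prob, [0.0, 0.0])
```

Any optimizer that takes a function and a gradient could be swapped in here. The result lives on the unconstrained scale and would need to be mapped back to the constrained parameters for interpretation.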
Print Posterior Summary Statistics from Command Line
Daniel Lee wrote a program to print a summary of one or more chains from the command line, mirroring the print() command of RStan.
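For readers who want to see what such a summary involves, here is a small sketch in plain Python for a single parameter. It is not Daniel's program. The names are ours, the quantiles use simple linear interpolation, and the R-hat is the classic potential scale reduction formula for chains of equal length, which may differ in detail from what Stan computes.

```python
import math
import statistics

def quantile(sorted_xs, q):
    """Quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] * (1.0 - frac) + sorted_xs[hi] * frac

def rhat(chains):
    """Classic potential scale reduction for equal-length chains."""
    if len(chains) < 2:
        return float("nan")  # needs at least two chains
    n = len(chains[0])
    chain_means = [statistics.mean(c) for c in chains]
    within = statistics.mean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)

def summarize(chains):
    """Posterior summary for one parameter from one or more chains."""
    pooled = sorted(x for c in chains for x in c)
    return {
        "mean": statistics.mean(pooled),
        "sd": statistics.stdev(pooled),
        "2.5%": quantile(pooled, 0.025),
        "50%": quantile(pooled, 0.5),
        "97.5%": quantile(pooled, 0.975),
        "rhat": rhat(chains),
    }
```

The standard error of the mean is left out because it depends on an effective sample size estimate, which is a longer story than fits in a sketch.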
We also fixed a bad memory leak in multivariate operations that was introduced in the last release when we optimized the matrix operations for derivative calculations. We also fixed the Windows issue with conservative matrix resizing which caused multivariate models to crash under Windows at optimization levels above 0.
There has been a lot of activity in various branches that hasn’t been merged into the trunk yet, so stay tuned.
v1.2.0 (6 March 2012)
======================================================================

Enhancements
----------------------------------
* full mass matrix estimation during warmup
* expose model log_prob and gradient functions in RStan for use in other packages (such as optimizers)
* command-line program to display output from multiple chains with parameter-by-parameter mean, se, sd, quantiles, and R-hat
* probability function speed improvements with vectorization
* created Stan contributed repositories for user-contributed and experimental features (first entry is an emacs mode)
* modified makefiles so targets are the same under Windows, Linux, and Mac

New Functions
----------------------------------
* most of the cumulative distribution functions (see the documentation index for the full list of supported functions)
* added monitor() function in RStan

Bug Fixes
----------------------------------
* disabled Boost asserts in parser to quiet R's warnings
* enabled prints in generated quantities block
* various documentation patches
* fixed memory leak in matrix operations leading to leaks in multivariate probability function use
* wrapped call to gradient log prob to catch unexpected exceptions
* fixed matrix resize issue on Windows that caused models to fail at optimization levels above 0
* fixed bug in print preventing hyphens or grave accents from printing
* fixed issue preventing matrix rows from being assigned on the left side of an assignment statement
* clearer error messages on matrix and other function arguments