
This post is by Jonah and Aki.

We’re happy to announce the release of v2.0.0 of the **loo** R package for efficient approximate leave-one-out cross-validation (and more). For anyone unfamiliar with the package, the original motivation for its development is in our paper:

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. *Statistics and Computing*, 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4. (published version, arXiv preprint)

Version 2.0.0 is a major update (release notes) to the package that we’ve been working on for quite some time and in this post we’ll highlight some of the most important improvements. Soon I (Jonah) will follow up with a post about important new developments in our various other R packages.

**New interface, vignettes, and more helper functions to make the package easier to use**

Because of certain improvements to the algorithms and diagnostics (summarized below), the interfaces, i.e., the `loo()` and `psis()` functions and the objects they return, also needed some improvement. Other related packages in the Stan R ecosystem (e.g., **rstanarm**, **brms**, **bayesplot**, **projpred**) have also been updated to integrate seamlessly with **loo** v2.0.0. (Apologies to anyone who happened to install the update during the short window between the **loo** release and when the compatible rstanarm/brms binaries became available on CRAN.)

Three vignettes now come with the **loo** package and are also available (and more nicely formatted) online at mc-stan.org/loo/articles:

- *Using the loo package (version >= 2.0.0)*
- *Bayesian Stacking and Pseudo-BMA weights using the loo package*
- *Writing Stan programs for use with the loo package*

A vignette about K-fold cross-validation using new K-fold helper functions will be included in a subsequent update. Since the last release of **loo** we have also written a paper, Visualization in Bayesian workflow, that includes several visualizations based on computations from **loo**.
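As a rough sketch of the updated interface, the basic workflow looks something like the following. This uses a simulated log-likelihood matrix as a stand-in for the pointwise log-likelihood you would extract from a real fitted model (e.g., with `rstan::extract_log_lik()`); the numbers here are illustrative only.

```r
library(loo)

# Simulated stand-in for a pointwise log-likelihood matrix:
# 4000 posterior draws (4 chains x 1000 iterations) by 100 observations.
set.seed(1)
log_lik <- matrix(rnorm(4000 * 100, mean = -1), nrow = 4000, ncol = 100)
chain_id <- rep(1:4, each = 1000)

# relative_eff() computes relative effective sample sizes that account for
# MCMC autocorrelation. Note it takes exp(log_lik), not log_lik itself.
r_eff <- relative_eff(exp(log_lik), chain_id = chain_id)

# loo() now accepts r_eff and returns a 'loo' object containing the
# estimates (elpd_loo, p_loo, looic) and the diagnostics.
loo1 <- loo(log_lik, r_eff = r_eff)
print(loo1)
```

Printing the returned object shows the estimates with their standard errors along with the diagnostic summary described below.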

**Improvements to the PSIS algorithm, effective sample sizes and MC errors**

The approximate leave-one-out cross-validation performed by the **loo** package depends on Pareto smoothed importance sampling (PSIS). In **loo** v2.0.0, the PSIS algorithm (the `psis()` function) corresponds to the algorithm in the most recent update to our PSIS paper, including adapting the Pareto fit with respect to the effective sample size and using a weakly informative prior to reduce the variance for small effective sample sizes. (I believe we’ll be updating the paper again with some proofs from new coauthors.)
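For those who want to use the PSIS algorithm directly (not only via `loo()`), here is a minimal sketch with simulated inputs. For PSIS-LOO the log importance ratios are the negated pointwise log-likelihoods; the simulated matrix below just stands in for real posterior draws.

```r
library(loo)

# Simulated log importance ratios: for PSIS-LOO these would be -log_lik.
set.seed(2)
log_lik <- matrix(rnorm(4000 * 100, mean = -1), nrow = 4000, ncol = 100)
log_ratios <- -log_lik
r_eff <- relative_eff(exp(-log_ratios), chain_id = rep(1:4, each = 1000))

# psis() Pareto-smooths the importance weights and returns a 'psis' object.
psis_result <- psis(log_ratios, r_eff = r_eff)

# Per-observation Pareto k estimates and PSIS effective sample sizes:
str(psis_result$diagnostics)

# The smoothed weights (log scale and normalized by default):
lw <- weights(psis_result)
```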

For users of the **loo** package for PSIS-LOO cross-validation, and not just the PSIS algorithm for importance sampling, an even more important update is that the latest version of the same PSIS paper referenced above describes how to compute the effective sample size estimate and Monte Carlo error for the PSIS estimate of `elpd_loo` (the expected log predictive density for new data). Thus, in addition to the Pareto k diagnostic (an indicator of convergence rate; see the paper) already available in previous **loo** versions, we now also report an effective sample size that takes into account both the MCMC efficiency and the importance sampling efficiency. Here’s an example of what the diagnostic output table from **loo** v2.0.0 looks like (the particular intervals chosen for binning the k values are explained in the papers and also the package documentation):

```
Pareto k diagnostic values:
                         Count  Pct.   Min. n_eff
(-Inf, 0.5]  (good)        240  91.6%        205
 (0.5, 0.7]  (ok)            7   2.7%         48
   (0.7, 1]  (bad)           8   3.1%          7
   (1, Inf)  (very bad)      7   2.7%          1
```

We also compute and report the Monte Carlo SE of `elpd_loo` to give an estimate of its accuracy. If any k > 1 (which means the PSIS-LOO approximation is not reliable, as in the example above), NA is reported for the Monte Carlo SE. We hope that showing the relationship between the k diagnostic, the effective sample size, and the MCSE of `elpd_loo` will make the diagnostics easier to interpret than in previous versions of **loo**, which reported only the k diagnostic. This particular example is taken from one of the new vignettes, which uses it as part of a comparison of unstable and stable PSIS-LOO behavior.
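These diagnostics can also be accessed programmatically through helper functions. A brief sketch (again on a simulated log-likelihood matrix, so unlike the table above, essentially all k values should land in the "good" bin here):

```r
library(loo)

set.seed(3)
log_lik <- matrix(rnorm(4000 * 100, mean = -1), nrow = 4000, ncol = 100)
r_eff <- relative_eff(exp(log_lik), chain_id = rep(1:4, each = 1000))
loo1 <- loo(log_lik, r_eff = r_eff)

# The binned diagnostic summary shown above:
pareto_k_table(loo1)

# Indices of observations with k above a threshold
# (candidates for closer inspection or exact refits):
flagged <- pareto_k_ids(loo1, threshold = 0.7)

# Per-observation k values and PSIS effective sample sizes:
k_vals <- pareto_k_values(loo1)
n_effs <- psis_n_eff_values(loo1)

# Monte Carlo SE of elpd_loo (NA when some k > 1):
mcse_loo(loo1)
```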

**Weights for model averaging: Bayesian stacking, pseudo-BMA and pseudo-BMA+**

Another major addition is the `loo_model_weights()` function, which, thanks to the contributions of Yuling Yao, can be used to compute weights for model averaging or selection. `loo_model_weights()` provides a user-friendly interface to the new `stacking_weights()` and `pseudobma_weights()` functions, which are implementations of the methods from Using stacking to average Bayesian predictive distributions (Yao et al., 2018). As shown in the paper, Bayesian stacking (the default for `loo_model_weights()`) provides better model averaging performance than “Akaike style” weights. However, the **loo** package does also include Pseudo-BMA weights (PSIS-LOO based “Akaike style” weights) and Pseudo-BMA+ weights, which are similar to Pseudo-BMA weights but use a so-called Bayesian bootstrap procedure to better account for the uncertainties. We recommend the Pseudo-BMA+ method instead of, for example, WAIC weights, although we prefer the stacking method to both. In addition to the Yao et al. paper, the new vignette about computing model weights demonstrates some of the motivation for our preference for stacking when appropriate.

**Give it a try**

You can install **loo** v2.0.0 from CRAN with `install.packages("loo")`. Additionally, reinstalling an interface that provides **loo** functionality (e.g., **rstanarm**, **brms**) will automatically update your **loo** installation. The **looo** website with online documentation is mc-stan.org/loo and you can report a bug or request a feature on GitHub.

The post loo 2.0 is loose appeared first on Statistical Modeling, Causal Inference, and Social Science.
