This year we had the privilege of sponsoring StanCon. Unfortunately, we weren’t able to actually attend the conference. Rather than let our ticket go to waste, we ran a small competition, which Ignacio Martinez won with his very cool (but in alpha stage) R package – see gif above.
Highlights from StanCon 2018
During my econ PhD I learned a lot about frequentist statistics. Alas, my training in Bayesian statistics was limited. Three years ago, I joined @MathPolResearch and started delving into this whole new world. Two weeks ago, thanks to @jumping_uk, I was able to attend StanCon. This was an amazing experience, which allowed me to meet some great people and learn a lot from them. These are my highlights from the conference:
You’d better have a very good reason not to use hierarchical models. Ben Goodrich’s tutorial on advanced hierarchical models was great. Most social science data has a natural hierarchy, and modeling it using Stan is easy! Slides for this three-day tutorial are available here: [day 1, day 2, day 3].
Everyone should take his or her model to the loo. @avehtari’s excellent tutorial covered cross-validation, reference predictive and projection predictive approaches for model assessment, selection and inference after model selection. This tutorial is available online, and everyone using Stan should do it.
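To give a feel for what leave-one-out cross-validation estimates, here is a minimal Python sketch of plain importance-sampling LOO computed from a matrix of pointwise log-likelihood draws. It is not the method in the loo package, which additionally applies Pareto smoothing to the importance weights; the function name `is_loo_elpd` and the toy data are my own assumptions for illustration.

```python
import numpy as np

def is_loo_elpd(log_lik):
    """Pointwise LOO log predictive density via raw importance sampling.

    log_lik has shape (n_draws, n_obs): log p(y_i | theta_s) for each draw s.
    Uses p(y_i | y_-i) ~= 1 / mean_s(1 / p(y_i | theta_s)).
    """
    log_lik = np.asarray(log_lik, dtype=float)
    n_draws = log_lik.shape[0]
    neg = -log_lik
    shift = neg.max(axis=0)  # stabilise the exponentials
    log_mean_inv = shift + np.log(np.exp(neg - shift).sum(axis=0)) - np.log(n_draws)
    return -log_mean_inv

# Toy input: made-up log-likelihood draws, 1000 draws for 5 observations
rng = np.random.default_rng(0)
toy_log_lik = rng.normal(loc=-1.0, scale=0.3, size=(1000, 5))
elpd_pointwise = is_loo_elpd(toy_log_lik)
elpd_loo = elpd_pointwise.sum()
```

The raw weights can have very heavy tails, which is exactly the problem the Pareto smoothing in loo is designed to address, so treat this as a teaching aid rather than something to use on a real model.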
Bob Carpenter’s tutorial on how to verify fit and diagnose convergence answered many practical and theoretical questions I had. Bob did a great job explaining how the effective sample sizes and potential scale reduction factors (“R hats”) are calculated. He also gave us some practical rules:
- We want R hat to be less than 1.05 and greater than 0.9
- R hat equal to 1 does not guarantee convergence
- An effective sample size between 50 and 100 is enough
- Don’t be afraid to ask questions on the Stan forum
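To make the R hat idea concrete, here is a minimal Python sketch of the classic Gelman-Rubin potential scale reduction factor, which compares between-chain and within-chain variance. It is only an illustration: Stan's own computation also splits each chain in half, which this sketch omits, and the function name and simulated chains are my own assumptions.

```python
import numpy as np

def potential_scale_reduction(chains):
    """Classic (non-split) R hat for draws shaped (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n_chains, n_draws = chains.shape
    chain_means = chains.mean(axis=1)
    # B: between-chain variance; W: average within-chain variance
    between = n_draws * chain_means.var(ddof=1)
    within = chains.var(axis=1, ddof=1).mean()
    # Pooled estimate of the posterior variance
    var_plus = (n_draws - 1) / n_draws * within + between / n_draws
    return np.sqrt(var_plus / within)

rng = np.random.default_rng(0)
# Four simulated chains that all sample the same distribution
mixed = rng.normal(size=(4, 1000))
# The same chains, but two of them are stuck around a different mean
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])

rhat_mixed = potential_scale_reduction(mixed)
rhat_stuck = potential_scale_reduction(stuck)
```

Here `rhat_mixed` comes out close to 1, while `rhat_stuck` lands well above the 1.05 threshold because the chains disagree about where the posterior mass is.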
The Bayesian Decision Making for Executives and Those who Communicate with Them series by Eric Novik and Jonathan Auerbach had some very good advice:
- Before model building, ask: What decisions are you trying to make? What is the cost of the wrong decision? What is the gain from a good decision?
- During model building: Elicit enough information about the problem so that a generative model can be expressed. This is very hard. A lot depends on the industry (e.g., book publishers are very different from pharma companies).
- After the model has been fit: Communicate the results so stakeholders can make a decision. Some things to keep in mind when doing so include:
  - Stakeholders should not care about p-values, Bayes factors or ROC curves (but sometimes do).
  - Stakeholders should care about the uncertainty in your estimates, but often they do not.
  - Stakeholders should know their loss or utility function, but they often do not.
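The link between posterior uncertainty and a utility function can be sketched in a few lines of Python: average the utility of each candidate action over the posterior draws and pick the action with the highest expected utility. Everything specific here is hypothetical, including the simulated `effect_draws` (a stand-in for draws from a fitted model), the two actions, and the launch cost of 0.2.

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in for posterior draws of an effect; in practice these come from the fitted model
effect_draws = rng.normal(loc=0.5, scale=1.0, size=4000)

def utility(action, effect):
    # Hypothetical utility: launching costs 0.2 and pays the effect; holding pays nothing
    if action == "launch":
        return effect - 0.2
    return np.zeros_like(effect)

# Expected utility of each action, averaged over posterior uncertainty
expected = {action: utility(action, effect_draws).mean() for action in ("launch", "hold")}
best_action = max(expected, key=expected.get)
```

The point of the exercise is that the decision depends on the whole posterior and on the stakeholder's utility, not on whether a single estimate clears some significance threshold.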
To sum this up, the Stan developers are an incredibly talented and generous group of people who have created a useful and flexible programming language and a fantastic community around it. I look forward to future StanCons. A few other things that I am looking forward to in the nearer future (and I underStand are coming soon…):
- A series of Coursera massive open online courses (MOOCs)
- Support for parallel computing with MPI and GPUs
- loo 2.0