1. Improve sampler efficiency by picking the right model
We need to address how we specify our models before even discussing writing code that is optimized computationally. Make sure that your model is parameterized in such a way that Stan can easily sample it using its algorithms – No U-TURN Sampler (NUTS) or Hamiltonian Monte Carlo (HMC). Simply selecting the right model can make a big difference in terms of efficiency. Due to differences in implementations and algorithms, an efficient parameterization in stan is not necessarily the one that was best in other software we’ve tried.
A poorly specified model will require more samples to reach convergence and adequately explore the posterior distribution or they may not converge at all. For some models, reparameterization can be an effective means of improve sampling efficiency by replacing a complex distribution Stan has difficulty sampling from, with one from which it can draw more easily. One example discussed in Section 28.6 of the Stan manual involves reparameterizing the Cauchy distribution, a challenge for Stan to sample from because of the heavy tails. The difficulties can be fixed by instead sampling a uniform random variable and using the probability integral transform.
2. Matrices or arrays, pick carefully!
It can also be confusing whether to use matrices or arrays when writing your code. There are actually four different ways to specify a two-dimensional collection of real numbers! But which one should you pick? This largely depends on the operations you need to perform. If you need to do matrix computations, you should be using a matrix. However, if you frequently need to index into the rows of the matrix it is more efficient to use arrays. In this situation, it will save you a headache to declare an array of type row_vector than to work with matrices.
Matrices and arrays should also be traversed in different order. Loops involving arrays should be traversed with the last index varying fastest, whereas the opposite is true for matrices. Additionally, traversing through matrices is slightly more efficient. If for example your code involves an I x J array of matrices each having dimension R x C, then the most efficient way to write a loop that traverses every element is:
for (i in 1:I)
for (j in 1:J)
for (c in 1:C)
for (r in 1:R)
a[i,j,r,c] = ……
3. Let built-in functions and vectorization save you time
Stan has a number of optimized built-in functions for common computations such as dot products and special matrix multiplications. You should use these whenever possible to save yourself from having to write your own code to perform the calculations. There are also functions that will improve the speed by vectorizing code. For example, this loop
for(r in 1:R)
m[r] = normal(0,1);
should be replaced with the faster, vectorized code:
to_vector(m) ~ normal(0,1);
4. Customize with compiler optimizations
Because Stan relies on compiled C++ code, it may be possible for advanced users to further optimize by changing compiler flags in R’s version of a Makefile, known as a Makevars file. It is recommended to do this in a user specific file located in ~/.R/Makevars. Be careful though – using overly aggressive compiler options can result in code that is not portable across machines and architectures. The trade-off however is code that is as fast as possible on your own computer. The RStan installation guide recommends adding CXXFLAGS==-O3 to the Makevars file for the highest level of overall optimization possible, but be warned – this can result in increased memory usage and larger binary file sizes!
We hope these four tips help you improve the efficiency of your coded models! Check out more tips and tricks here.