bayes.js: A Small Library for Doing MCMC in the Browser
Bayesian data analysis is cool, Markov chain Monte Carlo is the cool technique that makes Bayesian data analysis possible, and wouldn’t it be coolness if you could do all of this in the browser? That was what I thought, at least, and I’ve now made bayes.js: A small JavaScript library that implements an adaptive MCMC sampler and a couple of probability distributions, and that makes it relatively easy to implement simple Bayesian models in JavaScript.
Here is a motivating example: Say that you have the heights of the last ten American presidents…
```javascript
// The heights of the last ten American presidents in cm, from Kennedy to Obama
var heights = [183, 192, 182, 183, 177, 185, 188, 188, 182, 185];
```
… and that you would like to fit a Bayesian model assuming a Normal distribution to this data. Well, you can do that right now by clicking “Start sampling” below! This will run an MCMC sampler in your browser implemented in JavaScript.
If this doesn’t seem to work in your browser for some reason, then try this version of the demo.
Here is the model you just sampled from…
$$\mu \sim \text{Normal}(0, 1000) \\
\sigma \sim \text{Uniform}(0, 1000) \\
\text{heights}_i \sim \text{Normal}(\mu, \sigma) ~~~ \text{for} ~ i ~ \text{in} ~ 1..n$$
… and this is how it is implemented in JavaScript:
```javascript
/* The code below assumes that you have loaded the two modules of bayes.js:
 * - mcmc.js which implements the sampler and creates the global
 *   object mcmc.
 * - distributions.js which implements a number of log density functions
 *   for common probability distributions and that creates the global object
 *   ld (as in log density).
 */

// The data
var heights = [183, 192, 182, 183, 177, 185, 188, 188, 182, 185];

// Parameter definitions
var params = {
  mu:    {type: "real"},
  sigma: {type: "real", lower: 0}};

// Model definition
var log_post = function(state, heights) {
  var log_post = 0;
  // Priors (here sloppy and vague...)
  log_post += ld.norm(state.mu, 0, 1000);
  log_post += ld.unif(state.sigma, 0, 1000);
  // Likelihood
  for(var i = 0; i < heights.length; i++) {
    log_post += ld.norm(heights[i], state.mu, state.sigma);
  }
  return log_post;
};

// Initializing the sampler, burning some draws to the MCMC gods,
// and generating a sample of size 1000.
var sampler = new mcmc.AmwgSampler(params, log_post, heights);
sampler.burn(1000);
var samples = sampler.sample(1000);
```
I’ve implemented a JavaScript MCMC procedure for fitting a Bayesian model before, but that was just for a specific model (I also implemented an MCMC procedure in BASIC, but don’t ask me why…). The idea with bayes.js is to make it easier for me (and maybe for you) to make demos of Bayesian procedures that are easy to put online. If you would like to know more about bayes.js, just head over to its GitHub page where you will find the code and a README file full of details. You can also check out a couple of interactive demos that I’ve made:
- A Normal distribution
- A Bernoulli distribution with a Beta prior
- A Bernoulli model with a spike-and-slab prior
- An analysis of capture-recapture data
- Correlation analysis / bivariate Normal model
- Simple linear regression
- Multivariate logistic regression
- The “Pump” example from the BUGS project
- A hierarchical model with varying slope and intercept from Gelman and Hill (2006) (this is stretching the limits of what the simple AmwgSampler can do…)
These demos rely on the plotly library and I haven’t tested them extensively on different platforms/browsers. You should be able to change the data and model definition on the fly (but if you change some stuff, like adding multidimensional variables, the plotting might stop working).
What’s included in bayes.js?
The two major files in bayes.js are:
- mcmc.js - Implements an MCMC framework which can be used to fit Bayesian models with both discrete and continuous parameters. Currently the only algorithm implemented is a version of the adaptive Metropolis within Gibbs (AmwgSampler) algorithm presented by Roberts and Rosenthal (2009). Loading this file in the browser creates the global object mcmc.
- distributions.js - A collection of log density functions that can be used to construct Bayesian models. It follows the naming scheme ld.* (for example, ld.norm and ld.pois) and uses the same parameterizations as the d* density functions in R. Loading this file in the browser creates the global object ld.
In addition to this, the whole thing is wrapped in an RStudio project, as I’ve used R and JAGS to write some tests.
FAQ
- When is a JavaScript MCMC sampler useful?
- Well, for starters, it’s not particularly useful if you want to do serious Bayesian data analysis. Then you should use a serious tool like JAGS or Stan. It could, however, be useful if you want to put a demo of a Bayesian model online but don’t want to / can’t run the computations on a server. It could also be useful as part of a JavaScript application that makes use of Bayesian computation at some point.
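Regarding that last point: the sampler just hands you plain arrays of draws, so summarizing the posterior inside an application takes only a few lines of JavaScript. The helper below is my own sketch, not part of bayes.js:

```javascript
// Summarize one parameter's posterior draws with the posterior mean and a
// central 95% credible interval read off the sorted sample.
function summarizePosterior(draws) {
  var sorted = draws.slice().sort(function(a, b) { return a - b; });
  var sum = 0;
  for (var i = 0; i < draws.length; i++) { sum += draws[i]; }
  return {
    mean:  sum / draws.length,
    lower: sorted[Math.floor(0.025 * (sorted.length - 1))],
    upper: sorted[Math.ceil(0.975 * (sorted.length - 1))]
  };
}
```

With the Normal model above you would call it as, say, `summarizePosterior(samples.mu)`.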
- How good is the sampler that bayes.js uses?
- bayes.js implements the adaptive Metropolis within Gibbs described by Roberts and Rosenthal (2009), which is a good algorithm in that (1) it’s adaptive and works out of the box without you having to set a lot of tuning parameters, (2) it can handle both continuous and discrete parameters, and (3) it is easy to implement. The downsides are that (1) it only works well with a small number of parameters, and (2) being a Gibbs-style sampler, it is going to struggle with correlated parameters.
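For intuition, here is a stripped-down sketch of the kind of Metropolis within Gibbs sweep that AmwgSampler performs. This is my own illustration, not the actual bayes.js code: the adaptive tuning of the proposal standard deviations, and the handling of bounds and discrete parameters, are all left out.

```javascript
// One sweep of random-walk Metropolis within Gibbs: update each parameter
// in turn, holding the others fixed at their current values.
// log_post: function taking a state object, returning the log posterior.
// state:    current parameter values, e.g. {mu: 0, sigma: 1}.
// prop_sd:  per-parameter proposal standard deviations.
function metropolisWithinGibbsStep(log_post, state, prop_sd) {
  var param_names = Object.keys(state);
  for (var i = 0; i < param_names.length; i++) {
    var param = param_names[i];
    var current = state[param];
    var log_post_current = log_post(state);
    // Propose from Normal(current, prop_sd[param]) using Box-Muller.
    var z = Math.sqrt(-2 * Math.log(1 - Math.random())) *
            Math.cos(2 * Math.PI * Math.random());
    state[param] = current + prop_sd[param] * z;
    // Accept with probability min(1, exp(log posterior difference)),
    // otherwise revert to the current value.
    if (Math.log(Math.random()) > log_post(state) - log_post_current) {
      state[param] = current;
    }
  }
  return state;
}
```

The adaptive part that this sketch omits is what makes the real sampler practical: it nudges each `prop_sd` up or down between batches so that the acceptance rate lands near the efficient range.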
- What is “a small number of parameters”?
- Depends heavily on context but <10 is probably “small” here. But depending on the model you’re trying to run you might get away with 100+ parameters.
- How fast is it?
- Also super context dependent. On simple models it’s pretty fast; for example, fitting a standard Normal model on 1000 data points and producing a sample of 20,000 draws takes ~0.5 s on my computer. Also, when I’ve been playing around with different browsers I’ve seen order-of-magnitude changes in performance when changing seemingly arbitrary things. For example, inlining the definition of the Normal density in the function calculating the log posterior, rather than using the ld.norm defined in distributions.js, resulted in 10x slower sampling on Firefox 37.
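To make concrete what “inlining the definition of the Normal density” means, here are the two variants side by side. The ld.norm below is my own local stand-in using the same (x, mean, sd) parameterization as the library function, so the snippet is self-contained. Both functions compute exactly the same log likelihood; the point above is that JavaScript engines may nevertheless run them at very different speeds.

```javascript
// Stand-in log density of the Normal distribution, parameterized like R's
// dnorm(x, mean, sd, log = TRUE).
var ld = {
  norm: function(x, mean, sd) {
    return -0.5 * Math.log(2 * Math.PI) - Math.log(sd) -
           Math.pow(x - mean, 2) / (2 * sd * sd);
  }
};

// Variant 1: call the log density function once per data point.
var log_lik_calling = function(data, mu, sigma) {
  var ll = 0;
  for (var i = 0; i < data.length; i++) {
    ll += ld.norm(data[i], mu, sigma);
  }
  return ll;
};

// Variant 2: the density is inlined, with the terms that do not depend on
// the data hoisted out of the loop.
var log_lik_inlined = function(data, mu, sigma) {
  var ll = -data.length * (0.5 * Math.log(2 * Math.PI) + Math.log(sigma));
  var inv_two_var = 1 / (2 * sigma * sigma);
  for (var i = 0; i < data.length; i++) {
    var d = data[i] - mu;
    ll -= d * d * inv_two_var;
  }
  return ll;
};
```

If performance matters for your model, it seems worth benchmarking both styles in the browsers you care about rather than assuming either is faster.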
References
Roberts, G. O., & Rosenthal, J. S. (2009). Examples of adaptive MCMC. Journal of Computational and Graphical Statistics, 18(2), 349–367.