Defining Custom Model Priors in BMS

(This article was first published on BMS Add-ons » BMS Blog, and kindly contributed to R-bloggers)

Bayesian Model Averaging (BMA) allows for any kind of model prior distributions. While the R package BMS has built-in support for several types of commonly used priors, there may be the need for constructing a custom model prior in a particular exercise.
This tutorial is for users which already dispose of some experience in BMS. Beginners might find other tutorials vastly more helpful.

The first two of the following sections provide for a quick introduction to custom model priors in BMS, while the remaining ones discuss more advanced topics.


An illustrative example

Suppose we apply BMA on dataset attitude, but have a particular preference for the regressor learning. For this particular example, let’s put a uniform model prior on all models, but the models that include learning should have double that model prior. I.e. the model prior for models that do not include learning should be proportional to 1, while the model prior for models that include learning should be proportional to 2.

We will construct a user-defined model prior function to embody this behaviour. Note, however, that this particular setting can much more easily be implemented with a built-in model prior from BMS (see section below).
First, define the following function (details will be explained in the next section):

.mprior.test.init = function(...) { 
  # a custom model prior

  specialprior = log(2)
  
  return(list(
    mp.mode="custom-defined",
    mp.msize=NA,
    pmp=function(mdraw, ...) {
         return(ifelse(as.logical(mdraw[3]), specialprior, 0L))
      }
  ))
}

The function .mprior.test.init is a ‘creator’ function that returns a model prior object, which in turn will output the log model prior probability as required. We can directly pass the ‘creator’ function to the bms function to perform BMA:

library(BMS)
data(attitude)
firsttest = bms( attitude, mprior = .mprior.test.init )

Now look at the posterior inclusion probabilities:

coef(firsttest)
	 	 PIP   Post Mean  Post SD	
complaints	0.9995	 0.6664	  0.1328	
learning	0.5772	 0.1373	  0.1643	
advance		0.2346	-0.0346	  0.1005	
privileges	0.1788	-0.0136   0.0634	
raises		0.1643	 0.0081	  0.0827	
critical	0.154	 0.0015	  0.0545	

The result produces a higher posterior includion probability for learning than under a standard uniform model prior (as returned from bms(attitude, mprior="uniform").)


The example explained

The function bms takes functions such as the .mprior.test.init to construct a “model prior object”, uses it for sampling, and appends it to the output (augmented with some additional information). The model prior object created by the exercise above, can be looked at via

firsttest$mprior.info

The built-in model priors similarly derive from such functions. For instance the uniform model prior comes from the function .mprior.uniform.init, and the call bms(attitude, mprior="uniform") is equivalent to:

bms(attitude, mprior = .mprior.uniform.init)

bms takes the “creator function” (such as .mprior.test.init), and calls it (through some helper functions) with the following arguments:

  • mpparam: the argument mprior.size to bms (default NA)
  • K: The total number of variables (in our case, 6)
  • X.data: The argument data.frame or matrix of variables (first column is the dependent variable)

The creator function then returns a list that contains at least the following slots:

  • mp.mode: A character description of the model prior
  • mp.msize: The prior expected model size. Preferably a scalar, set it to NA if you do not know.
  • pmp: A function returning the log- prior model probability based on two parameters:
    • mdraw: A logical vector of length K detailing whcih variables are includied in the model: E.g. mdraw=c(T,T,F,F,F,F) is a model including only the first two variables.
    • ki : The number of included variables (equal to sum(mdraw)

Writing a more general model prior

The model prior above is very specific, and we could write a more general model prior for this purpose, that not only checks the user’s input, but also lets her choose the preferred variable. Moreover, it is recommended that a model prior object contains the expected prior model size and prior model size distribution for use in functions such as plotModelsize.
First, note that for each model size k, there are K over k models, of which K-1 over k-1 models contain our preferred variable i. This information can be integrated into a more general model prior ‘creator’ function:

.mprior.test2.init = function(K, mpparam, ..., specialprior=log(2)) {
   #user input checks:
  myindex=as.numeric(mpparam)
  if (is.na(myindex)|| !is.numeric(myindex) || length(myindex)==0L) myindex=1
  if ((myindex<1)||myindex>K) stop("model prior parameter is wrong")

  #pre-computing stuff:
  priorKdist= (choose(K, 0:K) + choose(K-1, 0:K -1 )) / (2^K*3/2)
  priormsize = sum( 0:K * priorKdist)

 
  #result: 
  return(list(
    mp.mode="custom-defined",
    mp.msize=priormsize,
    pmp=function(ki, mdraw, ...) {
         return(ifelse(as.logical(mdraw[3]), specialprior, 0L))
    },
    mp.Kdist= priorKdist
  ))
}

Here, the prior model distribution is put into the slot mp.Kdist which allows functions such as plotModelsize to process it for plotting etc.
The user checks are set up such that the user might choose her own variable to be ‘preferred’. (Note that here we only allow an integer index to be supplied for the ‘preferred’ variable, but one could add much more general options, of course.)
This more general model prior function allows for producing output similar to firsttest with the command:

secondtest = bms(attitude, mprior = .mprior.test2.init, mprior.size=3)

Note that in function .mprior.test2.init, we defined the parameter specialprior which will actually never be changed by the calling functions, and therefore could just as well have be defined as a constant within the function code. Nonetheless, it still could be used as an argument, as demonstrated in the last section.


Verifying results with a built-in model prior

BMS actually provides a built-in model prior which could be used equivalently to the exercise above: The custom posterior inclusion probability prior allows for defining binomial inclusion probabilities for each regressor. In order to replicate the prior above, we simply need to set the prior inclusion probability for each variable other than learning to one half (akin to the uniform model prior), and set the inclusion probability for learning to 2 /(1+2). (This result follows straightforwardly from writing the model prior up binomially.)

veriftest = bms(attitude, mprior="pip", mprior.size=c(0.5, 0.5, 2/3, rep(0.5, 3)) )

Looking at the results with coef(veriftest) shows that the results are indeed the same as under firsttest.


Additional remarks on user-defined model priors

  • Note that the slot mp.Kdist in your model prior object has length K+1: Its first elements contains prior probability for models of size zero, the second element for models of size 1 (with a single included variable) etc.
  • Keep the argument ... in the ‘creator’ and ‘pmp’ functions, as the list of arguments passed to these functions from bms might be extended in the future.
  • Similar customization options apply to the coefficient priors and MCMC Samplers in package BMS. For more details, consider the custom priors overview page..

Advanced use of a user-defined model prior

Note that bms not only accepts the model prior ‘creator function’ but also its output. These may come more handy with respect to R scoping, when one would need additional paramters to be passed to the model prior. For instance, reconsider the example .mprior.test2.init from above. We would like to use in an adjusted bms function, that takes specialprior as an additional parameter.

mybms = function(X.data, mprior="custom-defined", pref.variable=1, specialprior=2, ...) {
  my_mprior = .mprior.test2.init(K=ncol(X.data)-1, mpparam = pref.variable, specialprior=log(specialprior) )
  return( bms(X.data=X.data, mprior=my_mprior, ...) )
}

The resulting function mybsshould behave very similarly to bms, except that it does use our custom model prior, and does not allow for any other model prior.

To leave a comment for the author, please follow the link and comment on their blog: BMS Add-ons » BMS Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags:

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)