# Defining Custom Model Priors in BMS

Bayesian Model Averaging (BMA) allows for any kind of model prior distributions. While the R package BMS has built-in support for several types of commonly used priors, there may be the need for constructing a custom model prior in a particular exercise.
This tutorial is for users which already dispose of some experience in BMS. Beginners might find other tutorials vastly more helpful.

The first two of the following sections provide for a quick introduction to custom model priors in BMS, while the remaining ones discuss more advanced topics.

### An illustrative example

Suppose we apply BMA on dataset `attitude`, but have a particular preference for the regressor `learning`. For this particular example, let’s put a uniform model prior on all models, but the models that include `learning` should have double that model prior. I.e. the model prior for models that do not include `learning` should be proportional to 1, while the model prior for models that include `learning` should be proportional to 2.

We will construct a user-defined model prior function to embody this behaviour. Note, however, that this particular setting can much more easily be implemented with a built-in model prior from BMS (see section below).
First, define the following function (details will be explained in the next section):

```.mprior.test.init = function(...) {
# a custom model prior

specialprior = log(2)

return(list(
mp.mode="custom-defined",
mp.msize=NA,
pmp=function(mdraw, ...) {
return(ifelse(as.logical(mdraw), specialprior, 0L))
}
))
}
```

The function `.mprior.test.init` is a ‘creator’ function that returns a model prior object, which in turn will output the log model prior probability as required. We can directly pass the ‘creator’ function to the `bms` function to perform BMA:

```library(BMS)
data(attitude)
firsttest = bms( attitude, mprior = .mprior.test.init )
```

Now look at the posterior inclusion probabilities:

```coef(firsttest)
PIP   Post Mean  Post SD
complaints	0.9995	 0.6664	  0.1328
learning	0.5772	 0.1373	  0.1643
privileges	0.1788	-0.0136   0.0634
raises		0.1643	 0.0081	  0.0827
critical	0.154	 0.0015	  0.0545
```

The result produces a higher posterior includion probability for `learning` than under a standard uniform model prior (as returned from `bms(attitude, mprior="uniform")`.)

### The example explained

The function `bms` takes functions such as the `.mprior.test.init` to construct a “model prior object”, uses it for sampling, and appends it to the output (augmented with some additional information). The model prior object created by the exercise above, can be looked at via

```firsttest\$mprior.info
```

The built-in model priors similarly derive from such functions. For instance the uniform model prior comes from the function `.mprior.uniform.init`, and the call `bms(attitude, mprior="uniform")` is equivalent to:

```bms(attitude, mprior = .mprior.uniform.init)
```

bms takes the “creator function” (such as `.mprior.test.init`), and calls it (through some helper functions) with the following arguments:

• mpparam: the argument `mprior.size` to `bms` (default NA)
• K: The total number of variables (in our case, 6)
• X.data: The argument data.frame or matrix of variables (first column is the dependent variable)

The creator function then returns a list that contains at least the following slots:

• mp.mode: A character description of the model prior
• mp.msize: The prior expected model size. Preferably a scalar, set it to NA if you do not know.
• pmp: A function returning the log- prior model probability based on two parameters:
• mdraw: A logical vector of length K detailing whcih variables are includied in the model: E.g. `mdraw=c(T,T,F,F,F,F)` is a model including only the first two variables.
• ki : The number of included variables (equal to `sum(mdraw)`

### Writing a more general model prior

The model prior above is very specific, and we could write a more general model prior for this purpose, that not only checks the user’s input, but also lets her choose the preferred variable. Moreover, it is recommended that a model prior object contains the expected prior model size and prior model size distribution for use in functions such as `plotModelsize`.
First, note that for each model size k, there are K over k models, of which K-1 over k-1 models contain our preferred variable i. This information can be integrated into a more general model prior ‘creator’ function:

```.mprior.test2.init = function(K, mpparam, ..., specialprior=log(2)) {
#user input checks:
myindex=as.numeric(mpparam)
if (is.na(myindex)|| !is.numeric(myindex) || length(myindex)==0L) myindex=1
if ((myindex<1)||myindex>K) stop("model prior parameter is wrong")

#pre-computing stuff:
priorKdist= (choose(K, 0:K) + choose(K-1, 0:K -1 )) / (2^K*3/2)
priormsize = sum( 0:K * priorKdist)

#result:
return(list(
mp.mode="custom-defined",
mp.msize=priormsize,
pmp=function(ki, mdraw, ...) {
return(ifelse(as.logical(mdraw), specialprior, 0L))
},
mp.Kdist= priorKdist
))
}
```

Here, the prior model distribution is put into the slot mp.Kdist which allows functions such as `plotModelsize` to process it for plotting etc.
The user checks are set up such that the user might choose her own variable to be ‘preferred’. (Note that here we only allow an integer index to be supplied for the ‘preferred’ variable, but one could add much more general options, of course.)
This more general model prior function allows for producing output similar to `firsttest` with the command:

```secondtest = bms(attitude, mprior = .mprior.test2.init, mprior.size=3)
```

Note that in function `.mprior.test2.init`, we defined the parameter `specialprior` which will actually never be changed by the calling functions, and therefore could just as well have be defined as a constant within the function code. Nonetheless, it still could be used as an argument, as demonstrated in the last section.

### Verifying results with a built-in model prior

BMS actually provides a built-in model prior which could be used equivalently to the exercise above: The custom posterior inclusion probability prior allows for defining binomial inclusion probabilities for each regressor. In order to replicate the prior above, we simply need to set the prior inclusion probability for each variable other than `learning` to one half (akin to the uniform model prior), and set the inclusion probability for `learning` to `2 /(1+2)`. (This result follows straightforwardly from writing the model prior up binomially.)

```veriftest = bms(attitude, mprior="pip", mprior.size=c(0.5, 0.5, 2/3, rep(0.5, 3)) )
```

Looking at the results with `coef(veriftest)` shows that the results are indeed the same as under `firsttest`.

### Additional remarks on user-defined model priors

• Note that the slot `mp.Kdist` in your model prior object has length K+1: Its first elements contains prior probability for models of size zero, the second element for models of size 1 (with a single included variable) etc.
• Keep the argument `...` in the ‘creator’ and ‘pmp’ functions, as the list of arguments passed to these functions from bms might be extended in the future.
• Similar customization options apply to the coefficient priors and MCMC Samplers in package BMS. For more details, consider the custom priors overview page..

#### Advanced use of a user-defined model prior

Note that bms not only accepts the model prior ‘creator function’ but also its output. These may come more handy with respect to R scoping, when one would need additional paramters to be passed to the model prior. For instance, reconsider the example `.mprior.test2.init` from above. We would like to use in an adjusted bms function, that takes `specialprior` as an additional parameter.

```mybms = function(X.data, mprior="custom-defined", pref.variable=1, specialprior=2, ...) {
my_mprior = .mprior.test2.init(K=ncol(X.data)-1, mpparam = pref.variable, specialprior=log(specialprior) )
return( bms(X.data=X.data, mprior=my_mprior, ...) )
}
```

The resulting function `mybs`should behave very similarly to `bms`, except that it does use our custom model prior, and does not allow for any other model prior.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

Tags: