The little non-informative prior that could (be informative)

November 26, 2013
(This article was first published on Statistical Reflections of a Medical Doctor » R, and kindly contributed to R-bloggers)

Christian Robert reviewed online a paper that was critical of non-informative priors. Among the points discussed by him and other contributors (e.g. Keith O’Rourke) was the issue of induced priors, i.e. priors that arise from a transformation of the original parameters or of observables. I found this exchange interesting because I did something similar when revisiting an old research project that had been collecting digital dust on my hard disk. The specific problem had to do with the analysis of a biomarker measured with a qualitative technique that yields a binary classification of measurements as present or absent, in two experimental conditions (call them A and B). Ignoring some technical aspects of the study design, the goal was to calculate the odds ratio of the biomarker being expressed in condition B vs. A (the reference state, signifying absence of disease).

When writing the programs for the analysis, I defaulted to the N(0.0,1.0E-6) prior (mean 0 and precision 1.0E-6, i.e. a standard deviation of 1000) that epitomizes non-informativeness in BUGS. However, one of my co-authors asked the “What the @#$%& does this prior mean?” question. And then we stopped … and reflected on what we were about to do. You see, before the experiment started we had absolutely no prior information about the behaviour of the biomarker in either experimental state, so we did not want to commit one way or another. In other words, Laplace’s original uniform (or Beta(1,1)) prior would have been reasonable if the expression data for A and B were to be analyzed separately. However, we wanted to analyze the data with a logistic regression model, so was the ubiquitous N(0.0,1.0E-6) the prior we were after?

The answer is a loud NO! According to Wikipedia, the mother of all knowledge, the logit transformation of a uniform variate follows the logistic distribution with location zero and scale 1. Hence, the prior on the intercept of the logistic regression (interpretable as the log-odds of the biomarker being expressed in state A) had to be a Logistic(0,1).
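The claim can be checked with a one-line change of variables. Writing p for the probability of expression in state A and θ = logit(p) for the intercept (symbols introduced here for illustration, not taken from the original analysis), a uniform prior on p induces:

```latex
\begin{aligned}
p &\sim \mathrm{U}(0,1), \qquad
\theta = \log\frac{p}{1-p}
\;\Longleftrightarrow\;
p = \frac{e^{\theta}}{1+e^{\theta}},\\
f_{\theta}(\theta) &= f_{p}\bigl(p(\theta)\bigr)\,\left|\frac{dp}{d\theta}\right|
= 1 \cdot \frac{e^{\theta}}{\bigl(1+e^{\theta}\bigr)^{2}},
\end{aligned}
```

which is the Logistic(0,1) density, with mean 0 and variance π²/3.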

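[Figure: histogram of the logit of U(0,1) draws, with the moment-matched Normal and the Logistic(0,1) densities overlaid]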

Surprisingly, the induced prior on the log odds ratio of B vs. A was found (after trial and error and method-of-moments considerations) to be very well approximated by a 1:1 mixture of a logistic and a Gaussian, which clearly departs from the N(0.0,1.0E-6) prior we (almost) used:

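[Figure: histogram of logit(B)-logit(A) for two U(0,1) priors, with Normal, Logistic and 1:1 mixture densities overlaid]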
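A rough numerical check of that approximation, written as a sketch rather than a reproduction of the original analysis: it compares the empirical CDF of simulated log odds ratios with the three candidate CDFs on a grid. The sample size n, the grid and the helper max_gap are choices made here for illustration.

```r
## Sketch: how far is each candidate from the simulated log odds ratio?
set.seed(1234)
n <- 200000                                 # assumed sample size, smaller than in the post
d <- qlogis(runif(n)) - qlogis(runif(n))    # logit(B) - logit(A) under two U(0,1) priors
grid <- seq(-10, 10, by = 0.1)
Fhat <- ecdf(d)(grid)                       # empirical CDF evaluated on the grid

sd_norm  <- pi * sqrt(2/3)                  # sd of a difference of two Logistic(0,1)
sc_logis <- sqrt(2)                         # logistic scale giving the same variance

## hypothetical helper: largest absolute CDF discrepancy on the grid
max_gap <- function(cdf) max(abs(Fhat - cdf))

gap_norm  <- max_gap(pnorm(grid, 0, sd_norm))
gap_logis <- max_gap(plogis(grid, 0, sc_logis))
gap_mix   <- max_gap(0.5 * pnorm(grid, 0, sd_norm) +
                     0.5 * plogis(grid, 0, sc_logis))
print(c(normal = gap_norm, logistic = gap_logis, mixture = gap_mix))
```

One plausible reading of the method-of-moments remark: both single candidates already share the variance 2π²/3 of the difference, and a 1:1 mixture of two zero-mean components with equal variance has the average of their excess kurtoses, (1.2 + 0)/2 = 0.6, which is the excess kurtosis of a difference of two independent logistic variates.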

Bottom line: Even informative (in the BUGS sense!) priors can be pretty non-informative in some intuitively appropriate parameterization. Conversely, one could start with a non-informative prior in a parameterization that is easier to reason about and look for an induced prior (using analytic considerations or even simulations) to convert it to a parameterization that is more appropriate to the analytic plan at hand.
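As a sketch of the simulation route, the lines below push two candidate priors for the intercept back to the probability scale. They assume that N(0.0,1.0E-6) follows the BUGS convention of mean and precision, so that the standard deviation is 1000; n and the object names are illustrative.

```r
## Sketch: what do two intercept priors imply for the probability itself?
set.seed(1234)
n <- 100000                                 # assumed number of prior draws
sd_vague <- 1 / sqrt(1.0E-6)                # BUGS precision 1.0E-6 -> sd = 1000

p_vague <- plogis(rnorm(n, 0, sd_vague))    # induced prior under N(0, sd = 1000)
p_logis <- plogis(rlogis(n, 0, 1))          # induced prior under Logistic(0,1)

## hypothetical helper: share of prior mass on probabilities below 1% or above 99%
extreme <- function(p) mean(p < 0.01 | p > 0.99)
ext_vague <- extreme(p_vague)
ext_logis <- extreme(p_logis)
print(c(vague = ext_vague, logistic = ext_logis))

## prior mass in each quarter of (0,1) under the Logistic(0,1) intercept
print(round(table(cut(p_logis, seq(0, 1, by = 0.25))) / n, 3))
```

Under these assumptions the "vague" normal places more than 99% of its mass on probabilities below 0.01 or above 0.99, whereas the Logistic(0,1) intercept places 2% there, as a uniform prior on the probability would.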

(R code for the plots and simulations is given below)

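## approximating uniforms
logit<-function(x) log(x/(1-x))
set.seed(1234)
N<-10000000   # 1e7 draws; lower this if memory or time is tight
u1<-runif(N,0,1)
u2<-runif(N,0,1)
y<-logit(u1)
y2<-logit(u2)
## moments of y, used for the matching normal curve
m<-mean(y)
s<-sd(y)
x<-seq(-10,10,.1)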
## logistic is logit of a uniform
hist(y,prob=TRUE,breaks=50,main="intercept",
     xlab="logit(A)")
lines(x,dnorm(x,m,s),col="red")
lines(x,dlogis(x,0,1),col="blue")
## colours follow the curves above: normal in red, logistic in blue
legend(-15,0.20,legend=c("Normal(m,s)",
      "Logistic(0,1)"),lty=1,col=c("red","blue") )

## approximating the difference of two uniforms
hist(y-y2,prob=TRUE,ylim=c(0,.25),breaks=200,
     xlim=c(-10,10),main="OR between two U(0,1)",
     xlab="logit(B)-logit(A)")
## logistic approximation
lines(x,dlogis(x,0,sqrt(2)),col="blue",lwd=2)
## normal
lines(x,dnorm(x,0,(pi)*sqrt(2/3)),col="red",lwd=2)
## mixture of a logistic and a normal approximation
lines(x,0.5*(dlogis(x,0,sqrt(2))+
     dnorm(x,0,(pi)*sqrt(2/3))),col="green",lwd=2)
## legends
NL<-expression(paste("Normal(0,",pi*sqrt(2/3),")"))
LL<-expression(paste("Logistic(0,",sqrt(2),")"))
ML<-expression(paste("0.5 Normal(0,",pi*sqrt(2/3),")+0.5 Logistic(0,",sqrt(2),")"))
## colours follow the curves above: normal red, logistic blue, mixture green
legend(-6.5,0.25,legend=c(NL,LL,ML),
       lty=1,col=c("red","blue","green") )

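## does it extend to more general cases?
m1<- -2; m2<-2; s1<-1; s2<-2.5
l1<-rlogis(N,m1,s1)
l2<-rlogis(N,m2,s2)
d<-l1-l2
## histogram first, then a kernel density estimate in a new plot;
## the candidate curves below are drawn on top of the density plot
hist(d,prob=TRUE,ylim=c(0,0.25),breaks=200)
plot(density(d))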
lines(x,dlogis(x,m1-m2,sqrt(s1^2+s2^2)),col="green",lwd=2)
lines(x,dnorm(x,m1-m2,pi*sqrt((s1^2+s2^2)/3)),col="red",lwd=2)
lines(x,0.5*(dnorm(x,m1-m2,pi*sqrt((s1^2+s2^2)/3))+
             dlogis(x,m1-m2,sqrt(s1^2+s2^2))),col="blue",lwd=2)


Edit (29/11/2013):
Updated the first image due to an accidental reversal of the distribution labels
