Profile Likelihood for New Jersey U.S. Senate Special Election

September 16, 2013
(This article was first published on Statistical Research » R, and kindly contributed to R-bloggers)

As it stands right now, Cory Booker has a very good chance of winning the New Jersey special U.S. Senate election on October 16 to replace Frank Lautenberg and serve the remaining 15 months of his term.  With the election only about a month away, I used data from the pre-election polls to produce a likelihood estimate based on the four most recent polls listed on www.realclearpolitics.com.

Though it may not be the perfect choice of distribution, the normal distribution works well here, and its likelihood function makes it easy to produce the profile likelihood of the data.  The profile likelihood lets us describe one parameter at a time while eliminating the nuisance parameter (here the variance, $\sigma^2$).  In this case we’re particularly interested in the mean, $\mu$.
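For reference, the profile likelihood under the normal model can be written out as follows. This is the standard textbook derivation rather than anything specific to these polls; $x_1,\dots,x_n$ denote the poll results and $n$ the number of polls.

$$
\log L(\mu,\sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2
$$

For a fixed $\mu$ this is maximized at $\hat\sigma^2(\mu) = \frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2$. Substituting that value back in gives the profile log-likelihood

$$
\log L_p(\mu) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\hat\sigma^2(\mu) - \frac{n}{2}
$$

Only the middle term depends on $\mu$, so the other two drop out once the likelihood is rescaled to have a maximum of 1.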

Likelihood approaches are useful because they show not only where the likelihood is maximized but also which other values of the parameter $\theta$ are plausible.  Furthermore, likelihood approaches can have better small-sample properties than those based on asymptotic convergence (e.g. standard errors).

There are some differences between likelihood and Bayesian/Frequentist approaches.  In this case we can produce what is referred to as a likelihood interval.  Likelihood intervals suffer from a calibration problem: a ‘5% likelihood’ has no specific meaning on its own, whereas a 5% or 10% probability does.  So a probability-based inference can often be the better option.  In some cases, however, probability-based conclusions are simply ridiculous and have no useful value, and there probability-based inference is probably not the best choice.

In some cases the likelihood interval is the same as the confidence interval.  In particular, for interval estimates of a normal mean we can choose a likelihood cutoff of 15%, which corresponds to a 95% confidence interval in Frequentist terms.
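As a rough check on where the 15% figure comes from, the sketch below assumes the usual chi-square approximation, in which twice the log-likelihood ratio has one degree of freedom. The variable names are illustrative only. The correspondence is exact for a normal mean with known variance and only approximate when the variance is estimated from a handful of polls.

```r
# Likelihood cutoff matching a 95% confidence level, assuming the
# chi-square(1) approximation to twice the log-likelihood ratio.
conf.level <- 0.95
cutoff <- exp(-qchisq(conf.level, df = 1) / 2)
round(cutoff, 3)   # close to the 15% used in this post

# The reverse direction: confidence level implied by a 15% cutoff
implied.level <- pchisq(-2 * log(0.15), df = 1)
round(implied.level, 3)
```

The same recipe gives cutoffs for other confidence levels by changing `conf.level`.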

How is the Likelihood Interval interpreted?

• It can be interpreted the same way as a confidence interval if an exact or large-sample approximation is justified.  The likelihood must be reasonably regular (i.e. the log-likelihood can be approximated by a quadratic function), though this requirement can be quite forgiving.
• If the likelihood is clearly not regular (as can happen in small-sample problems or with non-normal distributions) then the interval can be interpreted as a pure likelihood interval.
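One informal way to judge regularity is to overlay the profile log-likelihood with its quadratic approximation at the maximum. The sketch below does this for the four poll values used in this post; the grid width of 8 points and the names `prof` and `quad` are arbitrary choices made for illustration.

```r
x <- c(64, 50, 54, 54)   # poll values used in this post
n <- length(x)
xbar <- mean(x)
s2.hat <- mean((x - xbar)^2)   # MLE of the variance

mu <- seq(xbar - 8, xbar + 8, length.out = 201)

# Profile log relative likelihood of mu (sigma^2 maximized out for each mu)
prof <- sapply(mu, function(m) -n / 2 * log(mean((x - m)^2) / s2.hat))

# Quadratic approximation based on the curvature at the MLE
quad <- -n * (mu - xbar)^2 / (2 * s2.hat)

plot(mu, prof, type = "l", lwd = 2, xlab = expression(mu),
     ylab = "Log relative likelihood")
lines(mu, quad, lty = 2, lwd = 2)
legend("bottom", c("Profile", "Quadratic approximation"), lty = 1:2, lwd = 2)
```

The closer the two curves stay over the range of interest, the more reasonable it is to read the likelihood interval as a confidence interval.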

The following graphs show the estimate for Cory Booker as well as the variance of the four polls.  Given these data (as of Friday, Sep. 13) and assuming a normal distribution, the maximum likelihood estimate puts Cory Booker at 55.5%.  Because the MLE of a normal mean is the sample average, this is the same value that www.realclearpolitics.com uses on their website.  We can also see that the other likely values are somewhere between 50.5% and 60.5%; this range is the 15% interval from the likelihood with $\sigma^2$ fixed at its estimate, and the profile likelihood interval is somewhat wider.


## Profile Likelihoods

par(mfrow=c(1,1))

x = c(64,50,54,54)

boxplot(x, col=3, main='Boxplot of All Polls', ylab='Cory Booker Percent')

n = length(x)

np = 500
sx = sqrt(var(x))

mu.theta = seq(mean(x)-8,mean(x)+8,len=np)
sigma.theta = seq(sx/2.75,sx*2.75,len=np)

logLikeFun <- function(mu, x){
  ## Profile log-likelihood of mu: sigma^2 is replaced by its MLE for the given mu.
  ## Only the middle term varies with mu; the last term always equals -n/2.
  s2 = sum((mu-x)^2)/n
  a = -n/2 * log( 2 * pi ) - n/2 * log( s2 ) - 1/(2 * s2)*sum((mu-x)^2)
  a
}
## Negative log-likelihood of (mu, sigma) without the constant term; uses the global x
logLikeFunSigma = function(mu,sigma){
  a = -n/2*log(sigma^2) -
    (sum(x^2) - 2*mu*sum(x) + n*mu^2)/(2*sigma^2)
  -a
}
## Likelihood interval: range of th where the scaled likelihood exceeds alpha
li = function(th,like,alpha=0.15){
  that = mean(th[like==max(like)])   ## location of the maximum
  lowth = th[th < that]
  lowlik = like[th < that]
  if (length(lowth) < 2){
    lowval = min(th)
  }
  if (length(lowth) > 1){
    lowval = approx(lowlik,lowth,xout=alpha)$y
  }
  upth = th[th > that]
  if (length(upth) < 2 ){
    return(c(lowval,max(th)))
  }
  if (length(upth) > 1){
    uplik = like[th > that]
    upval = approx(uplik,upth,xout=alpha)$y
    return(c(lowval,upval))
  }
}
## Joint Likelihood
ll.joint = outer(mu.theta, sigma.theta,'logLikeFunSigma')
like.joint = exp(min(ll.joint)-ll.joint)
contour(mu.theta, sigma.theta^2, like.joint,
xlab=expression(mu), ylab=expression(sigma^2),
levels=c(.1,.3,.5,.7,.9), lwd=2,
main="Joint Likelihood Contour of\nCory Booker Estimate and Variance of Polls")

## Profile Likelihood for mean mu
log.like = lapply(mu.theta, logLikeFun, x)
log.like = unlist(log.like)
like = exp(log.like - max(log.like)) ## normalize the value to 1.0

## Estimated Likelihood (mu, sigma^2=sigma.hat^2), i.e. sigma^2 fixed at its MLE
se = sqrt(var(x)*(n-1)/n)/sqrt(n)
estimate.like = dnorm(mean(x), mean=mu.theta, sd=se)
estimate.like = estimate.like/max(estimate.like)

## Likelihood of mu with sigma fixed at 1 (computed but not plotted below)
log.like.sigma = logLikeFunSigma(mu.theta, sigma=1)
sigma.like = exp(min(log.like.sigma) - log.like.sigma)

## Plot the profile and estimated likelihoods of mu

plot(mu.theta, like, type='n', main=expression(paste('Profile Likelihood of ',mu,' for Cory Booker')),
ylab='Likelihood', xlab=expression(mu), ylim=c(0,1)) ## create blank graph of the correct size
lines(mu.theta, like,lty=1,lwd=2, col=2)
lines(mu.theta, estimate.like, lty=2, lwd=2, col=3)
text(mean(x)+.75,0,format(mean(x),digits=3) )
abline(v=c(mean(x),li(mu.theta, like),li(mu.theta, estimate.like)), lty=1, lwd=c(2.5,1,1,1,1), col=c(1,2,2,3,3))
abline(h=.15)
legend('topright',c(expression(paste('L(',mu,')')),expression(paste('L(',mu,',',sigma^2,'=',hat(sigma)^2,') ' ))), lwd=2, col=c(2,3), bg='#FFFFFF')
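As an alternative to interpolating on a grid, the interval endpoints can also be found with a root finder. The sketch below is not the approach used above; it assumes the same four poll values and the same normal model, and the helper names `rel.like` and `like.interval` are made up for this example.

```r
x <- c(64, 50, 54, 54)   # same poll values as above
n <- length(x)
xbar <- mean(x)

# Profile relative likelihood of mu, scaled so that its maximum is 1
rel.like <- function(mu) {
  (mean((x - xbar)^2) / mean((x - mu)^2))^(n / 2)
}

# Find where the relative likelihood crosses the cutoff on each side of the MLE
like.interval <- function(cutoff = 0.15, width = 8) {
  f <- function(mu) rel.like(mu) - cutoff
  lower <- uniroot(f, c(xbar - width, xbar))$root
  upper <- uniroot(f, c(xbar, xbar + width))$root
  c(lower = lower, upper = upper)
}

like.interval()
```

The `width` argument only needs to be large enough that the likelihood has dropped below the cutoff at both ends of the search range; `uniroot` stops with an error otherwise.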


