Profile Likelihood

November 16, 2015

(This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers)

Consider some simulated data

> set.seed(1)
> x=exp(rnorm(100))

Assume that those data are observed random variables with distribution, with . The natural idea is to consider the maximum likelihood estimator

For instance, consider some maximum likelihood estimator,

> library(MASS)
> (F=fitdistr(x,"gamma"))
     shape       rate   
  1.4214497   0.8619969 
 (0.1822570) (0.1320717)
> F$estimate[1]+c(-1,1)*1.96*F$sd[1]
[1] 1.064226 1.778673

Here, we have an approximated (since the maximum likelihood has an asymptotic Gaussian distribution) confidence interval for . We can use numerical optimization routine to get the maximum of the log-likelihood function

> log_lik=function(theta){
+   a=theta[1]
+   b=theta[2]
+   logL=sum(log(dgamma(x,a,b)))
+   return(-logL)
+ }

> optim(c(1,1),log_lik)
[1] 1.4214116 0.8620311
[1] 146.5909

And we have the same value.

Now, what if we care only about , and not . The we can use profile likelihood. The idea is to solve


or, equivalently,

> prof_log_lik=function(a){
+   b=(optim(1,function(z) -sum(log(dgamma(x,a,z)))))$par
+   return(-sum(log(dgamma(x,a,b))))
+ }

> vx=seq(.5,3,length=101)
> vl=-Vectorize(prof_log_lik)(vx)
> plot(vx,vl,type="l")
> optim(1,prof_log_lik)
[1] 1.421094
[1] 146.5909

A few weeks ago, we have mentioned the likelihood ratio test, i.e.

The analogous can be obtained here, since

(the 1 comes from the fact that  is a one-dimensional coefficient). The (technical) proof can be found in Suhasini Subba Rao’s notes (see also Section 4.5.2 in Antony Davison’s Statistical Models). From that property, we can easily obtain a confidende interval for 

Hence, from our sample, we get the following 95% confidence interval,

> abline(v=optim(1,prof_log_lik)$par,lty=2)
> abline(h=-optim(1,prof_log_lik)$value)
> abline(h=-optim(1,prof_log_lik)$value-qchisq(.95,1)/2)
> segments(F$estimate[1]-1.96*F$sd[1],
> borne=-optim(1,prof_log_lik)$value-qchisq(.95,1)/2
> (b1=uniroot(function(z) Vectorize(prof_log_lik)(z)+borne,c(.5,1.5))$root)
[1] 1.095726
> (b2=uniroot(function(z) Vectorize(prof_log_lik)(z)+borne,c(1.25,2.5))$root)
[1] 1.811809

that can be visualized below,

> segments(b1,-168,b2,-168,lwd=3,col="red")

In blue the obtained obtained using the asymptotic Gaussian property of the maximum likelihood estimator, and in red, the obtained obtained using the asymptotic chi-square distribution of the log (profile) likelihood ratio.

To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics » R-english. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)