Profile Likelihood
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Consider some simulated data
> set.seed(1) > x=exp(rnorm(100))
Assume that those data are observed i.id. random variables with distribution, with . The natural idea is to consider the maximum likelihood estimator
For instance, consider some maximum likelihood estimator,
> library(MASS) > (F=fitdistr(x,"gamma")) shape rate 1.4214497 0.8619969 (0.1822570) (0.1320717) > F$estimate[1]+c(-1,1)*1.96*F$sd[1] [1] 1.064226 1.778673
Here, we have an approximated (since the maximum likelihood has an asymptotic Gaussian distribution) confidence interval for . We can use numerical optimization routine to get the maximum of the log-likelihood function
> log_lik=function(theta){ + a=theta[1] + b=theta[2] + logL=sum(log(dgamma(x,a,b))) + return(-logL) + } > optim(c(1,1),log_lik) $par [1] 1.4214116 0.8620311 $value [1] 146.5909
And we have the same value.
Now, what if we care only about , and not . The we can use profile likelihood. The idea is to solve
i.e.
or, equivalently,
> prof_log_lik=function(a){ + b=(optim(1,function(z) -sum(log(dgamma(x,a,z)))))$par + return(-sum(log(dgamma(x,a,b)))) + } > vx=seq(.5,3,length=101) > vl=-Vectorize(prof_log_lik)(vx) > plot(vx,vl,type="l") > optim(1,prof_log_lik) $par [1] 1.421094 $value [1] 146.5909
A few weeks ago, we have mentioned the likelihood ratio test, i.e.
The analogous can be obtained here, since
(the 1 comes from the fact that is a one-dimensional coefficient). The (technical) proof can be found in Suhasini Subba Rao’s notes (see also Section 4.5.2 in Antony Davison’s Statistical Models). From that property, we can easily obtain a confidende interval for
Hence, from our sample, we get the following 95% confidence interval,
> abline(v=optim(1,prof_log_lik)$par,lty=2) > abline(h=-optim(1,prof_log_lik)$value) > abline(h=-optim(1,prof_log_lik)$value-qchisq(.95,1)/2) > segments(F$estimate[1]-1.96*F$sd[1], -170,F$estimate[1]+1.96*F$sd[1],-170,lwd=3,col="blue") > borne=-optim(1,prof_log_lik)$value-qchisq(.95,1)/2 > (b1=uniroot(function(z) Vectorize(prof_log_lik)(z)+borne,c(.5,1.5))$root) [1] 1.095726 > (b2=uniroot(function(z) Vectorize(prof_log_lik)(z)+borne,c(1.25,2.5))$root) [1] 1.811809
that can be visualized below,
> segments(b1,-168,b2,-168,lwd=3,col="red")
In blue the obtained obtained using the asymptotic Gaussian property of the maximum likelihood estimator, and in red, the obtained obtained using the asymptotic chi-square distribution of the log (profile) likelihood ratio.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.