# Profile Likelihood

November 16, 2015
By

(This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers)

Consider some simulated data

```> set.seed(1)
> x=exp(rnorm(100))```

Assume that those data are observed i.id. random variables with distribution, with . The natural idea is to consider the maximum likelihood estimator

For instance, consider some maximum likelihood estimator,

```> library(MASS)
> (F=fitdistr(x,"gamma"))
shape       rate
1.4214497   0.8619969
(0.1822570) (0.1320717)
> F\$estimate[1]+c(-1,1)*1.96*F\$sd[1]
[1] 1.064226 1.778673```

Here, we have an approximated (since the maximum likelihood has an asymptotic Gaussian distribution) confidence interval for . We can use numerical optimization routine to get the maximum of the log-likelihood function

```> log_lik=function(theta){
+   a=theta[1]
+   b=theta[2]
+   logL=sum(log(dgamma(x,a,b)))
+   return(-logL)
+ }

> optim(c(1,1),log_lik)
\$par
[1] 1.4214116 0.8620311

\$value
[1] 146.5909```

And we have the same value.

Now, what if we care only about , and not . The we can use profile likelihood. The idea is to solve

i.e.

or, equivalently,

```> prof_log_lik=function(a){
+   b=(optim(1,function(z) -sum(log(dgamma(x,a,z)))))\$par
+   return(-sum(log(dgamma(x,a,b))))
+ }

> vx=seq(.5,3,length=101)
> vl=-Vectorize(prof_log_lik)(vx)
> plot(vx,vl,type="l")
> optim(1,prof_log_lik)
\$par
[1] 1.421094

\$value
[1] 146.5909```

A few weeks ago, we have mentioned the likelihood ratio test, i.e.

The analogous can be obtained here, since

(the 1 comes from the fact that  is a one-dimensional coefficient). The (technical) proof can be found in Suhasini Subba Rao’s notes (see also Section 4.5.2 in Antony Davison’s Statistical Models). From that property, we can easily obtain a confidende interval for

Hence, from our sample, we get the following 95% confidence interval,

```> abline(v=optim(1,prof_log_lik)\$par,lty=2)
> abline(h=-optim(1,prof_log_lik)\$value)
> abline(h=-optim(1,prof_log_lik)\$value-qchisq(.95,1)/2)

> segments(F\$estimate[1]-1.96*F\$sd[1],
-170,F\$estimate[1]+1.96*F\$sd[1],-170,lwd=3,col="blue")
> borne=-optim(1,prof_log_lik)\$value-qchisq(.95,1)/2
> (b1=uniroot(function(z) Vectorize(prof_log_lik)(z)+borne,c(.5,1.5))\$root)
[1] 1.095726
> (b2=uniroot(function(z) Vectorize(prof_log_lik)(z)+borne,c(1.25,2.5))\$root)
[1] 1.811809```

that can be visualized below,

`> segments(b1,-168,b2,-168,lwd=3,col="red")`

In blue the obtained obtained using the asymptotic Gaussian property of the maximum likelihood estimator, and in red, the obtained obtained using the asymptotic chi-square distribution of the log (profile) likelihood ratio.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...