(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)
Consider a (simple) Poisson regression
. Given a sample
where
, the goal is to derive a 95%
confidence interval for
given
, where
is the
prediction. Hence, we want to derive a confidence interval for the
prediction, not the potential observation, i.e. the dot on the graph below> r=glm(dist~speed,data=cars,family=poisson) > P=predict(r,type="response", + newdata=data.frame(speed=seq(-1,35,by=.2))) > plot(cars,xlim=c(0,31),ylim=c(0,170)) > abline(v=30,lty=2) > lines(seq(-1,35,by=.2),P,lwd=2,col="red") > P0=predict(r,type="response",se.fit=TRUE, + newdata=data.frame(speed=30)) > points(30,P1$fit,pch=4,lwd=3)

Let
denote the maximum likelihood estimator of
. Then

where
is Fisher information of
(from standard maximum likelihood theory). Recall that

where computation of those values is based on the following calculations

In the case of the log-Poisson regression

Let us get back to our initial problem.
- confidence interval for the linear combination
is to get a confidence interval for
(by taking exponential values of bounds, since the exponential is a monotone function). Asymptotically, we know that

will be based on
, obtained by plugging estimators of the parameters. Then, since
as an asymptotic multivariate distribution,
any linear combination of the parameters will also be normal, i.e.
has a normal distribution, centered on
, with variance
where
is the variance of
.
All those quantities can be easily computed. First, we can get the variance of the estimators> i1=sum(predict(reg,type="response")) > i2=sum(cars$speed*predict(reg,type="response")) > i3=sum(cars$speed^2*predict(reg,type="response")) > I=matrix(c(i1,i2,i2,i3),2,2) > V=solve(I)
> summary(reg)$cov.unscaled (Intercept) speed (Intercept) 0.0066870446 -3.474479e-04 speed -0.0003474479 1.940302e-05 > V [,1] [,2] [1,] 0.0066871228 -3.474515e-04 [2,] -0.0003474515 1.940318e-05Based on those values, it is easy to derive the standard deviation for the linear combination,
> x=30 > P2=predict(r,type="link",se.fit=TRUE,
+ newdata=data.frame(speed=x)) > P2 $fit 1 5.046034 $se.fit [1] 0.05747075 $residual.scale [1] 1 > > sqrt(V[1,1]+2*x*V[2,1]+x^2*V[2,2]) [1] 0.05747084 > sqrt(t(c(1,x))%*%V%*%c(1,x)) [,1] [1,] 0.05747084
Based on that technique, confidence intervals are not centered on the prediction, but who cares ?

- delta method
> estmean=t(c(1,x))%*%coef(reg) > var=t(c(1,x))%*%summary(reg)$cov.unscaled%*%c(1,x) > library(msm) > deltamethod (~ exp(x1), estmean, var) [1] 8.931232 > P1=predict(r,type="response",se.fit=TRUE,
+ newdata=data.frame(speed=30)) > P1 $fit 1 155.4048 $se.fit 1 8.931232 $residual.scale [1] 1

Note that those quantities - obtained with two different approaches - are rather close here
> exp(P2$fit-1.96*P2$se.fit) 1 138.8495 > P1$fit-1.96*P1$se.fit 1 137.8996 > exp(P2$fit+1.96*P2$se.fit) 1 173.9341 > P1$fit+1.96*P1$se.fit 1 172.9101
- bootstrap techniques

To leave a comment for the author, please follow the link and comment on his blog: Freakonometrics - Tag - R-english.
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series,ecdf, trading) and more...

Zero Inflated Models and Generalized Linear Mixed Models with R.
Zuur, Saveliev, Ieno (2012).