Studying joint effects in a regression

October 7, 2010
(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

We saw in the previous post how important the * operator (the Cartesian product) is for modelling joint effects in a regression. Consider the case of two explanatory variates, one continuous (http://perso.univ-rennes1.fr/arthur.charpentier/latex/prodcart01.png, the age of the driver) and one qualitative (http://perso.univ-rennes1.fr/arthur.charpentier/latex/prodcart02.png, gasoline versus diesel).

  • The additive model
Assume here that
http://perso.univ-rennes1.fr/arthur.charpentier/latex/prodcart03.png
Then, given http://perso.univ-rennes1.fr/arthur.charpentier/latex/prodcart04.png (the exposure, assumed to be constant) and http://perso.univ-rennes1.fr/arthur.charpentier/latex/prodcart01.png
http://perso.univ-rennes1.fr/arthur.charpentier/latex/prodcart05.png
Thus, the qualitative variate has a multiplicative effect.
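The formulas above are only linked as images, so here is one way to write the argument out. The notation is an assumption made for this sketch, not necessarily the one used in the images: N is the number of claims, E the exposure (entering as a multiplicative factor, i.e. as a log-offset), X_1 the age, X_2 the fuel type, s(.) the spline in age, and \beta_0, \beta_2 the intercept and the fuel coefficient.

```latex
% Assumed notation (hypothetical, see the paragraph above).
\begin{align*}
\mathbb{E}[N \mid X_1, X_2, E]
  &= E \cdot \exp\big(\beta_0 + s(X_1) + \beta_2 \mathbf{1}(X_2 = \text{gasoline})\big) \\
\frac{\mathbb{E}[N \mid X_1 = x, X_2 = \text{diesel}, E]}
     {\mathbb{E}[N \mid X_1 = x, X_2 = \text{gasoline}, E]}
  &= \frac{E \cdot \exp\big(\beta_0 + s(x)\big)}
          {E \cdot \exp\big(\beta_0 + s(x) + \beta_2\big)}
   = e^{-\beta_2}
\end{align*}
```

The age term s(x) cancels, which is why the fuel type only rescales the expected frequency by a factor that does not depend on age.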
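> library(splines)  # provides bs()
> # with the log link, the exposure enters as offset(log(exposition))
> reg=glm(nbre~bs(ageconducteur)+carburant+offset(log(exposition)),
+     data=sinistres,family="poisson")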
> ageD=data.frame(ageconducteur=seq(17,90),carburant="D",exposition=1)
> ageE=data.frame(ageconducteur=seq(17,90),carburant="E",exposition=1)
> yD=predict(reg,newdata=ageD,type="response")
> yE=predict(reg,newdata=ageE,type="response")
> lines(ageD$ageconducteur,yD,col="blue",lwd=2)
> lines(ageE$ageconducteur,yE,col="red",lwd=2)

On the graph below, we can see that the ratio
http://perso.univ-rennes1.fr/arthur.charpentier/latex/prodcart06.png
is constant (and independent of the age http://perso.univ-rennes1.fr/arthur.charpentier/latex/prodcart01.png).
> plot(ageD$ageconducteur,yD/yE)
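The `sinistres` dataset is not included in the post, so the following is only a sketch on simulated data. The data frame `sim`, its size, and the "true" coefficients are all made up for illustration; only the column names are borrowed from the code above. The exposure is entered as `offset(log(exposition))`, the usual form under a log link. The point is that, in the additive model, the ratio of the two predicted curves is a single number that can be read directly from the fuel coefficient.

```r
library(splines)  # bs() lives here

set.seed(1)
n <- 5000
# Hypothetical simulated portfolio standing in for `sinistres`
sim <- data.frame(
  ageconducteur = sample(17:90, n, replace = TRUE),
  carburant     = factor(sample(c("D", "E"), n, replace = TRUE)),
  exposition    = runif(n, 0.1, 1)
)
# Assumed truth: smooth age effect, diesel 20% more frequent at every age
lambda <- with(sim, exposition *
                 exp(-2 + ((ageconducteur - 50) / 40)^2 +
                       log(1.2) * (carburant == "D")))
sim$nbre <- rpois(n, lambda)

# Additive model: spline in age plus a fuel effect
reg <- glm(nbre ~ bs(ageconducteur) + carburant + offset(log(exposition)),
           data = sim, family = poisson)

ages <- 17:90
ageD <- data.frame(ageconducteur = ages, carburant = "D", exposition = 1)
ageE <- data.frame(ageconducteur = ages, carburant = "E", exposition = 1)
ratio <- predict(reg, newdata = ageD, type = "response") /
         predict(reg, newdata = ageE, type = "response")

range(ratio)                    # both ends agree up to rounding
exp(-coef(reg)["carburantE"])   # the same constant, from the coefficient
```

Since "D" is the reference level of the factor, the diesel/gasoline ratio is the exponential of minus the `carburantE` coefficient.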

  • The nonadditive model
In order to take into account a more complex (non-constant) interaction between the two explanatory variates, consider the following product model,
> # with the log link, the exposure enters as offset(log(exposition))
> reg=glm(nbre~bs(ageconducteur)*carburant+offset(log(exposition)),
+     data=sinistres,family="poisson")
> ageD=data.frame(ageconducteur=seq(17,90),carburant="D",exposition=1)
> ageE=data.frame(ageconducteur=seq(17,90),carburant="E",exposition=1)
> yD=predict(reg,newdata=ageD,type="response")
> yE=predict(reg,newdata=ageE,type="response")
> lines(ageD$ageconducteur,yD,col="blue",lwd=2)
> lines(ageE$ageconducteur,yE,col="red",lwd=2)

Here, the ratio
http://perso.univ-rennes1.fr/arthur.charpentier/latex/prodcart06.png
is no longer constant: it now varies with the age of the driver.
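As a sketch of the same idea on simulated data (again, `sim` and the coefficients used to generate it are invented for illustration, and the exposure is entered as `offset(log(exposition))`), one can fit both the additive and the product model, check that the predicted ratio now moves with age, and compare the two fits with a likelihood-ratio test.

```r
library(splines)

set.seed(2)
n <- 20000
# Hypothetical simulated portfolio; column names follow the post
sim <- data.frame(
  ageconducteur = sample(17:90, n, replace = TRUE),
  carburant     = factor(sample(c("D", "E"), n, replace = TRUE)),
  exposition    = runif(n, 0.1, 1)
)
# Assumed truth: the diesel effect grows linearly with age on the log scale
lambda <- with(sim, exposition *
                 exp(-2 + 0.01 * (ageconducteur - 50) * (carburant == "D")))
sim$nbre <- rpois(n, lambda)

reg_add <- glm(nbre ~ bs(ageconducteur) + carburant + offset(log(exposition)),
               data = sim, family = poisson)
reg_int <- glm(nbre ~ bs(ageconducteur) * carburant + offset(log(exposition)),
               data = sim, family = poisson)

ages <- 17:90
ageD <- data.frame(ageconducteur = ages, carburant = "D", exposition = 1)
ageE <- data.frame(ageconducteur = ages, carburant = "E", exposition = 1)
ratio_int <- predict(reg_int, newdata = ageD, type = "response") /
             predict(reg_int, newdata = ageE, type = "response")

range(ratio_int)                          # no longer a single value
anova(reg_add, reg_int, test = "Chisq")   # likelihood-ratio comparison
```

The product model adds one extra coefficient per spline basis function (three with the default `bs()` settings), which is what lets the fuel effect bend with age.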

  • Mixing additive and nonadditive
It is also possible to consider a model in between: we believe that there is no interaction for young people (say), while there is one for older ones. Assume that the break occurs at age 50,
> # with the log link, the exposure enters as offset(log(exposition))
> reg=glm(nbre~bs(ageconducteur*(ageconducteur<50))+
+     bs(ageconducteur*(ageconducteur>=50))*carburant+offset(log(exposition)),
+     data=sinistres,family="poisson")
> ageD=data.frame(ageconducteur=seq(17,90,by=.1),carburant="D",exposition=1)
> ageE=data.frame(ageconducteur=seq(17,90,by=.1),carburant="E",exposition=1)
> yD=predict(reg,newdata=ageD,type="response")
> yE=predict(reg,newdata=ageE,type="response")
> lines(ageD$ageconducteur,yD,col="blue",lwd=2)
> lines(ageE$ageconducteur,yE,col="red",lwd=2)

Here, the ratio
http://perso.univ-rennes1.fr/arthur.charpentier/latex/prodcart06.png
is constant for young people, while it changes with age for older ones.
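The mixed specification can be checked the same way on simulated data. As before, `sim` and its generating coefficients are hypothetical, the break at 50 is the one assumed in the text, and the exposure is entered as `offset(log(exposition))`. The sketch verifies that the predicted ratio is flat below 50 and varies above it.

```r
library(splines)

set.seed(3)
n <- 20000
# Hypothetical simulated portfolio; column names follow the post
sim <- data.frame(
  ageconducteur = sample(17:90, n, replace = TRUE),
  carburant     = factor(sample(c("D", "E"), n, replace = TRUE)),
  exposition    = runif(n, 0.1, 1)
)
# Assumed truth: fuel type only matters (and matters more with age) from 50 on
lambda <- with(sim, exposition *
                 exp(-2 + 0.02 * (ageconducteur - 50) *
                       (ageconducteur >= 50) * (carburant == "D")))
sim$nbre <- rpois(n, lambda)

# Same construction as above: one spline below 50 without interaction,
# one spline from 50 on, crossed with the fuel type
reg <- glm(nbre ~ bs(ageconducteur * (ageconducteur < 50)) +
             bs(ageconducteur * (ageconducteur >= 50)) * carburant +
             offset(log(exposition)),
           data = sim, family = poisson)

ages <- seq(17, 90, by = 0.1)
ageD <- data.frame(ageconducteur = ages, carburant = "D", exposition = 1)
ageE <- data.frame(ageconducteur = ages, carburant = "E", exposition = 1)
ratio <- predict(reg, newdata = ageD, type = "response") /
         predict(reg, newdata = ageE, type = "response")

range(ratio[ages < 50])    # a single value: no interaction below 50
range(ratio[ages >= 50])   # a genuine range: the ratio moves with age
```

Below 50 the second spline is evaluated at 0, where all its basis functions vanish, so only the main fuel effect remains and the ratio stays constant.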
