Basic Generalized Linear Modeling – Part 2: Exercises

Posted on August 1, 2018 by Hanif Kusuma in R bloggers

[This article was first published on R-exercises, and kindly contributed to R-bloggers.]

In this exercise, we will handle an over-dispersed count model by using a quasi-Poisson model. Over-dispersion simply means that the variance is greater than the mean, whereas a Poisson model assumes the two are equal. It’s important because ignoring it makes the standard errors too small, which inflates the test statistics and increases the possibility of Type I errors. We will use a data-set on amphibian road kill (Zuur et al., 2009). It has 17 explanatory variables. We’re going to focus on nine of them, using the total number of kills (TOT.N) as the response variable.
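To make the definition concrete, here is a minimal sketch on simulated counts (not the road-kill data). The negative binomial generator and its parameter values are assumptions chosen only to produce counts with extra variance.

```r
# Simulated counts only; this is not the road-kill data.
set.seed(1)
n <- 1000

# Poisson counts: the variance is approximately equal to the mean.
y_pois <- rpois(n, lambda = 5)

# Negative binomial counts with the same mean but extra variance
# (theoretical variance = mu + mu^2 / size = 5 + 25 / 1 = 30).
y_nb <- rnbinom(n, mu = 5, size = 1)

# Variance-to-mean ratio: near 1 for Poisson, well above 1 when over-dispersed.
ratio_pois <- var(y_pois) / mean(y_pois)
ratio_nb   <- var(y_nb) / mean(y_nb)
print(c(poisson = ratio_pois, neg_binomial = ratio_nb))
```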

Please download the data-set here and name it “Road.” Answers to these exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Load the data-set and the required packages before running the exercises.

Exercise 1
Plot the number of kills against distance. The plot should show the variability of kills decreasing with distance.
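A rough sketch of such a plot, using a simulated stand-in for the data-set. The column name `D.PARK` for distance is an assumption about the data-set, and all simulated values are made up for illustration; with the real data, only the `plot()` call would be needed.

```r
# Simulated stand-in for the "Road" data; D.PARK (distance) is an assumed name.
set.seed(42)
n <- 52
Road <- data.frame(D.PARK = seq(250, 25000, length.out = n))
Road$TOT.N <- rnbinom(n, mu = exp(4.3 - 0.0001 * Road$D.PARK), size = 3)

# Scatter plot of kills against distance.
plot(TOT.N ~ D.PARK, data = Road,
     xlab = "Distance (D.PARK)", ylab = "Total kills (TOT.N)")

# Crude numeric check: compare the spread of kills near and far.
near <- Road$TOT.N[Road$D.PARK <= median(Road$D.PARK)]
far  <- Road$TOT.N[Road$D.PARK >  median(Road$D.PARK)]
print(c(var_near = var(near), var_far = var(far)))
```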

Exercise 2
Fit the GLM with distance as the explanatory variable.
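As a hedged sketch of what such a fit could look like, the code below uses simulated data in place of the real data-set. The column name `D.PARK` is an assumption, and the Poisson fit is included only for comparison; the solutions page may take a different route.

```r
# Simulated stand-in for the "Road" data; D.PARK (distance) is an assumed name.
set.seed(42)
n <- 52
Road <- data.frame(D.PARK = seq(250, 25000, length.out = n))
Road$TOT.N <- rnbinom(n, mu = exp(4.3 - 0.0001 * Road$D.PARK), size = 3)

# Poisson fit, for reference only.
m_pois <- glm(TOT.N ~ D.PARK, family = poisson, data = Road)

# Quasi-Poisson fit: same coefficients, but the standard errors are
# scaled by the square root of the estimated dispersion parameter.
m_quasi <- glm(TOT.N ~ D.PARK, family = quasipoisson, data = Road)

print(summary(m_quasi))
```

The two fits share the same point estimates; only the standard errors and the tests based on them differ.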

Exercise 3
Add more covariates to the model and check the model summary to see what changes.

Exercise 4
Check for collinearity using variance inflation factors (VIFs). Set the base R options concerning missing values.
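For illustration, the classical VIF can be computed by hand in base R: regress each covariate on the others and take 1 / (1 - R^2). The covariates below are simulated and their names are hypothetical. A package function such as `car::vif()` applied to a fitted GLM may report somewhat different numbers, so treat this as a sketch of the idea rather than the method used in the solutions.

```r
# Simulated covariates; x1, x2, x3 are hypothetical names.
set.seed(7)
n <- 200
covs <- data.frame(x1 = rnorm(n))
covs$x2 <- covs$x1 + rnorm(n, sd = 0.3)  # strongly correlated with x1
covs$x3 <- rnorm(n)                      # unrelated to the others

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# covariate j on all remaining covariates.
vif_manual <- sapply(names(covs), function(v) {
  fit <- lm(reformulate(setdiff(names(covs), v), response = v), data = covs)
  1 / (1 - summary(fit)$r.squared)
})
print(round(vif_manual, 2))
```

Covariates with a large VIF (x1 and x2 here) carry overlapping information, so one of them is a candidate for removal.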

Exercise 5
Check the summary again and set the base R options. The previous exercise post in this series explains why we do this.

Exercise 6
Check for over-dispersion (as a rule of thumb, the dispersion value should be around 1). If it is still clearly greater or less than 1, then we need to check the diagnostic plots and re-run the GLM with a different model structure.
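One common way to get such a value is the Pearson dispersion statistic: the sum of squared Pearson residuals divided by the residual degrees of freedom. The sketch below uses simulated data with hypothetical variable names; the solutions page may use a different statistic.

```r
# Simulated over-dispersed counts; x and y are hypothetical names.
set.seed(42)
n <- 200
x <- runif(n)
y <- rnbinom(n, mu = exp(1 + 2 * x), size = 2)

m_pois <- glm(y ~ x, family = poisson)

# Pearson dispersion statistic: values well above 1 suggest over-dispersion.
dispersion <- sum(residuals(m_pois, type = "pearson")^2) / df.residual(m_pois)
print(dispersion)

# The quasi-Poisson summary reports the same estimate
# (up to numerical tolerance).
m_quasi <- glm(y ~ x, family = quasipoisson)
print(summary(m_quasi)$dispersion)
```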

Exercise 7
Simplify the model by dropping the least significant term and refitting it. Repeat until only significant terms remain.
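A single step of this backward selection might look like the sketch below. It assumes an F test via `drop1()`, which is a usual choice for quasi-likelihood models, and uses simulated data with hypothetical covariate names.

```r
# Simulated data; only x1 has a real effect on the counts.
set.seed(3)
n <- 150
d <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
d$y <- rnbinom(n, mu = exp(1 + 2 * d$x1), size = 2)

m_full <- glm(y ~ x1 + x2 + x3, family = quasipoisson, data = d)

# F tests for dropping each term in turn.
tab <- drop1(m_full, test = "F")
print(tab)

# Pick the term with the largest p-value (the first row is "<none>").
pvals <- tab[["Pr(>F)"]][-1]
worst <- rownames(tab)[-1][which.max(pvals)]

# Refit without that term.
m_reduced <- update(m_full, as.formula(paste(". ~ . -", worst)))
print(formula(m_reduced))
```

Repeating the `drop1()` and `update()` steps on `m_reduced` continues the selection.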

Exercise 8
Check the diagnostic plots. If there are still some problems, then we might need to use other types of regression, like Negative Binomial regression. We’ll discuss it in the next exercise post.
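As a minimal sketch, the standard diagnostic plots for a fitted GLM can be drawn with `plot()`, and a Pearson residuals plot can be added by hand. The data and variable names below are simulated and hypothetical.

```r
# Simulated over-dispersed counts; x and y are hypothetical names.
set.seed(11)
n <- 200
x <- runif(n)
y <- rnbinom(n, mu = exp(1 + 2 * x), size = 2)
m_quasi <- glm(y ~ x, family = quasipoisson)

# Standard diagnostic plots in a 2 x 2 layout.
op <- par(mfrow = c(2, 2))
plot(m_quasi)
par(op)

# Pearson residuals against fitted values: look for funnel shapes or trends.
plot(fitted(m_quasi), residuals(m_quasi, type = "pearson"),
     xlab = "Fitted values", ylab = "Pearson residuals")
abline(h = 0, lty = 2)
```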
