In this exercise set, we continue solving the GLM problems from the previous exercise set, so the numbering here starts at Exercise 9. Please make sure you have read and worked through the previous set before continuing.
In the last exercise set, we found that the model suffered from over-dispersion. To address it, we fitted a quasi-Poisson regression combined with step-wise variable-selection algorithms. Note that here we assume the selection is driven purely by the data, with no input from background theory or subject-matter knowledge about the data. In the real world that is never the case, but we use this simplification for the purposes of the exercise.
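As a reminder of that approach, the sketch below fits a quasi-Poisson model and performs one round of manual backward elimination. The data frame `dat`, the count response `y`, and the predictor names are placeholders, since the exercise data set is not named here; also note that `step()` cannot be used directly with quasi-likelihood fits because their AIC is undefined, so `drop1()` with an F-test is one common substitute.

```r
# Hypothetical sketch: `dat` is a placeholder data frame with a
# count response `y` and explanatory variables as its other columns.
fit_qp <- glm(y ~ ., family = quasipoisson(link = "log"), data = dat)

# step() relies on AIC, which is NA for quasi-Poisson models,
# so drop terms manually, one at a time, using F-tests:
drop1(fit_qp, test = "F")
```

Terms whose removal does not significantly worsen the fit (large p-value in the `drop1()` table) are candidates for elimination; refit without them and repeat.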
Answers to these exercises are available here. If you obtain a different (correct) answer than the one listed on the solutions page, please feel free to post your answer as a comment on that page. Load the data set and the required package before running the exercises.
Load the “MASS” package so that you can fit a negative binomial model. Fit the model, using all of the explanatory variables.
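One way this step might look, assuming a placeholder data frame `dat` with response `y` (the actual object and variable names come from the exercise data set):

```r
library(MASS)  # provides glm.nb() for negative binomial GLMs

# `y ~ .` regresses the response on all other columns of `dat`;
# glm.nb() uses a log link by default and estimates theta by ML.
fit_nb <- glm.nb(y ~ ., data = dat)
```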
Check the summary of the model.
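Continuing the sketch above (with the hypothetical fitted object `fit_nb`):

```r
# Shows coefficient estimates, standard errors, z-tests, the
# residual deviance, and the estimated dispersion parameter theta.
summary(fit_nb)
```

Comparing the residual deviance to its degrees of freedom gives a quick check on whether the negative binomial model has absorbed the over-dispersion seen in the Poisson fit.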
Set the base-R option that controls how missing values are handled, so that the subsequent model fits treat them consistently.
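A common choice before model selection (an assumption here, since the exercise does not spell out which option is meant) is to make missing values fail loudly, so that models fitted to silently differing subsets of rows are never compared:

```r
# Refuse to fit when rows contain NA, instead of silently dropping them.
options(na.action = "na.fail")
```

The alternative `options(na.action = "na.omit")` drops incomplete rows instead; either way, setting it explicitly makes the behaviour reproducible.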
The previous exercise suggested that variables 1, 3, 4, and 6 (or 1, 4, and 6) give the best model performance. Refit the model using those variables.
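For illustration, with `x1`, `x3`, `x4`, and `x6` as placeholders for explanatory variables 1, 3, 4, and 6 (substitute the actual column names from the data set):

```r
# Reduced negative binomial model on the selected variables.
fit_nb_reduced <- glm.nb(y ~ x1 + x3 + x4 + x6, data = dat)
summary(fit_nb_reduced)
```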
Check the diagnostic plots and draw a conclusion about whether the model gives the best performance.
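The standard base-R diagnostics can be drawn directly from the fitted object (again using the hypothetical `fit_nb_reduced` from the sketch above):

```r
par(mfrow = c(2, 2))   # arrange the four diagnostic panels in a grid
plot(fit_nb_reduced)   # residuals vs fitted, Q-Q, scale-location, leverage
```

Look for roughly patternless residuals, approximately straight Q-Q points, and no observations with extreme leverage before concluding that the reduced model performs well.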