Fitting count and zeroinflated count GLMMs with mgcv
Want to share your content on Rbloggers? click here if you have a blog, or here if you don't.
A couple of days ago, Mollie Brooks and coauthors posted a preprint on BioRχiv illustrating the use of the glmmTMB R package for fitting zeroinflated GLMMs (Brooks et al., 2017). In the paper, glmmTMB is compared with several other GLMMfitting packages. mgcv has recently gained the ability to fit a wider range of families beyond the exponential family of distributions, including zeroinflated Poisson models. mgcv can also fit simple GLMMs through a spline equivalent of a Gaussian random effect. So, whilst I was waiting on some Bayesian GAMs to finish sampling, I decided to see how mgcv compared against glmmTMB on the two examples used in the paper.
For this post I’ll be using a couple of packages beyond glmmTMB and mgcv; make sure you have ggplot2 and ggstance installed if you wish to run through the code below.
There are several ways in which mgcv allows GLMMs to be fitted, but the way that interests me here is via gam()
and the random effect spline basis. Penalised splines of the type provided in mgcv can also be represented in mixed model form, such that GAMs can also be fitted using mixed effect modelling software. The general idea is that the spline is decomposed into two parts:
 the perfectly smooth parts of the basis, namely those functions, including constant and linear functions, in the penalty null space of the spline. These are added to the fixed effects model matrix, whilst,
 the remaining wiggly parts of the basis are treated as random effects.
Given this duality between splines and random effects, you can reverse the idea and create a spline basis that is the equivalent of a simple Gaussian i.i.d random effect, such that you can fit a GLMM or GAMM using GAM software like mgcv. mgcv has the re
basis for this, and I’ll exploit that to fit the zeroinflated GLMMs to the two examples.
In Brooks et al. (2017), two example data sets are used;

Salamanders
— Seven combinations of different salamander species and lifestages were repeatedly sampled four times at 23 sites in Applachian streams (Price et al., 2016). Some of the streams were impacted by mountaintop removal and valley filling from coal mining. The data are available from Price et al. (2015), as well as the glmmTMB package. 
Owls
— the second example is a wellstudied one in mixed modelling papers and textbooks (Zuur et al., 2009, Bolker et al. (2013)), and relates to the begging behaviour of owl nestlings. The data were originally reported in Roulin and Bersier (2007).
Salamanders
Brooks et al. (2017) fit several count models to the Salamander
data set, including standard Poisson GLMMs, negative binomial GLMMs, with () estimated and modelled via a linear predictor, as well as zeroinflated Poisson (ZIP) and zeroinflated negative binomial (ZINB) models. Of these, gam()
can currently fit all but the negative binomial with () modelled via a linear predictor and the ZINB models.
The best fitting model of those presented was a negative binomial model, whilst Brooks et al. (2017) also illustrate how to generate fitted values from the ZIP. Rather than go through fitting all of the Brooks et al. (2017) models, I restrict fitting here to these two models. A gist with code to fit all the models that gam()
is capable of is available on Github. I have named the models similarly to Brooks et al. (2017) to facilitate comparison.
As glmmTMB()
is currently only capable of fitting models using maximum likelihood, not REML, I use the Laplace approximate maximum likelihood estimation method for gam()
. The new nb
family in mgcv is for the negative binomial distribution with the (fixed) dispersion parameter () estimated as a model parameter, in the same way that MASS::glm.nb()
and lme4::glmer.nb()
models do.
In the gam()
model, the random effect is specified using the standard s()
smooth function with the “re”
basis selected. The named variable, here site
, should be stored as a factor in the data object to avoid problems.
The figure below compares the coefficient estimates returned by glmmTMB()
and gam()
; they are very similar, which is encouraging.
The values (posterior modes, or means) for the site
random effect can also be compared
As the figure above shows, these too are essentially equivalent for the two fits.
The summary()
output for the glmmTMB()
model conveniently provides some additional useful information, in the context of GLMMs most notably the estimated variances (or standard deviations) of the random effect terms. As gam()
wasn’t designed with GLMMs specifically in mind, the same information is not provided in the the summary()
method for gam()
model fits. However, Simon Wood has provided the gam.vcomp()
function, which can be used to return the variance components of the model in a way that allows comparison with other mixedmodels specific software.
Now the gam()
version, conveniently with a confidence interval
One further analysis that Brooks et al. (2017) do with the Salamander
data (in their Appendix B) is to demonstrate how to generate and plot fitted values from the model. To do this, the analyst needs to consider whether to and how to marginalise over or condition on the random effects. The Appendix has some details on this more generally (via a linked reference) and morespecific pointers on how to go about doing this with glmmTMB()
models. In the next few code chunks I will illustrate how to achieve the result from their section Alternative prediction method, where the aim is to predict at the population mode by setting the random effect component to 0. To illustrate this, Brooks et al. (2017) use the more complex ZIP model with linear predictors for both the mean and the zeroinflation components of the model. I fit those models first
The glmmTMB()
model has the zeroinflation linear predictor specified via the ziformula
argument (abbreviated to zi
above). With gam()
however, multiple linear predictors are specified via a list of formula objects, only the first of which has a response (lefthandside). The first formula, with the response, is for the Poisson mean, whilst the second is for the zeroinflation component. Note also that we use the special ziplss()
family and that now the model is being estimated using REML, because that is the only option available for these models, which Simon Wood calls general smooth models (Wood et al., 2016). Do note that there is (as of writing) no link
argument(s) for the ziplss()
family. This is due to the way the model is parameterised internally in the software. This will require us to pay particular attention to the implementation shortly.
To recreate part of Figure B.3 in Appendix B (Brooks et al., 2017), the code below predicts from the fitted gam()
model for all combinations of the factors mined
and spp
. Notice how we have to specify a site
in the prediction data, otherwise predict()
will throw a tantrum. To set the random effect for site
to zero, use the exclude
argument. To exclude (i.e. set to zero) any model term, you supply a character vector or list of terms to exclude
. For smooth terms, these must be named as they appear in summary(model)
, hence the use of “s(site)”
. The final step is to call predict()
with type = “link”
. This will return a two column matrix (or a list of twocolumn matrices if se.fit = TRUE
is also used).
The first column is the predicted value of the response from the Poisson part of the model on the scale of the linear predictor (the log scale). The second column is the predicted value from the zeroinflation component and is on the complementary loglog scale. Both of these need to be back transformed to the respective response scales and then multiplied together. To do this for the zeroinflation part, I copied the code from the base R binomial()
family with the appropriate link specified. The second line of code below adds the predicted values for each combination of mined
and spp
to the prediction data object. Note that each component is backtransformed using the appropriate link, and then multiplied together.
A plot of the predicted values is then easily produced
Because of the way the gam()
model is implemented, I could also have computed the Bayesian credible intervals using the Bayesian covariance matrix of the model parameters via the se.fit
argument to predict()
. I’ll perhaps save that for another day…
Owls
The Owls
data are also available in the glmmTMB package, which I load and then do a little processing of the data to simplify the name of the response variable and to mean centre the ArrivalTime
covariate.
Two ZIP models are considered
 a ZIP with constant zeroinflation (an interceptonly model for the zeroinflation), and

a ZIP with complex zeroinflation, where one covariate and a random effect for
Nest
are included in the linear predictor of the zeroinflation part of the model.
The constant zeroinflation models are fitted using the ziformula
argument for glmmTMB
with family = poisson
, whilst for gam()
we use a list of two formula objects, the second for the ZI linear predictor, and the ziplss
family. Note that this model could also be fitted using the Zip()
family in mgcv but that employs a different, simpler fitting algorithm so to facilitate comparison with the more complex model I use ziplss()
instead.
Again note that these models are not estimated in the same way; glmmTMB()
estimates the model parameters using maximum likelihood, whilst only REML estimation is available for the ziplss()
family with gam()
. In gam()
, the interceptonly ZI linear predictor is specified with the formula ~ 1
.
To compare the estimates of the model coefficients I wrote a little function to extract the estimated values and their standard errors from the two model objects
Passing each of the models to createCoeftab()
results in a tidy data frame suitable for plotting with ggplot()
.
As can be seen in the figure, the estimates from the two functions are quite similar.
The morecomplex models with covariates in the ZI linear predictor are fitted next
As before, we gather the model coefficients
and plot them
and likewise as before, the estimates of the fixed effect terms are very similar indeed.
Conclusions
The comparisons shown above show that mgcv::gam()
and glmmTMB()
produce very similar estimates for the two models. And some crude timings showed that gam()
was 20–40% faster than glmmTMB()
at fitting the examples discussed in the paper. So all is roses, right!? Who needs glmmTMB()
?
That would however, be totally the wrong message to take from this comparison. Most notably, and something that isn’t surfaced in these simple examples is that gam()
is limited in the complexity of the random effects it can efficiently represent in models:
 it can’t do correlated random effects for random slopes and intercepts models (as far as I can tell anyway), and, and this is probably the deal breaker,

model fitting with
gam()
gets bogged down quickly if the number of levels in a random effect gets large. Jamie Ashander did some quick tests with a larger version of the Salamander with 100s ofsite
s andglmmTMB()
totally dominatedgam()
.
And that’s fine; gam()
was not designed to fit GLMMs — there are no less than three implementations by Simon Wood alone of functions to fit GAMs with complex random effects in mixed model software (gamm()
to fit with lme()
, gamm4()
to fit using lmer()
and glmer()
, and jagam()
in mgcv to fit via JAGS). Furthermore, glmmTMB()
is currently more flexible in the range of models that it can fit than any these implementations, except for JAGS, because the nb
, Zip
, and ziplss
families only work with gam()
.
What the above comparison illustrates, however, is that if you either don’t have complex or many random effects or that you don’t mind running models overnight, gam()
is a good option for fitting GLMMs. Plus you have the advantage of estimating smooth functions of covariates, which is one area where glmmTMB()
is currently very lacking compared to gam()
.
That said, it should be possible to emulate what PaulChristian Bürkner has done in his brms package (and similar implementations by Simon Wood in gamm4()
) to use mgcv to set up the correct model matrices for the random effect representation of splines which can then be fitted using glmmTMB()
.
Finally, this was a fun exercise to replicate the analyses in Brooks et al. (2017), motivated by a desire to understand what mgcv and gam()
are doing with these random effect splines. It wasn’t intended as a prizefight between two title contenders — hopefully this writeup didn’t come across that way. I also learned a lot more about glmmTMB, which is shaping up nicely and looks like it’ll have a place in my modelling toolbox.
References
Bolker, B. M., Gardner, B., Maunder, M., Berg, C. W., Brooks, M., Comita, L., et al. (2013). Strategies for fitting nonlinear ecological models in r, AD model builder, and BUGS. Methods in ecology and evolution / British Ecological Society 4, 501–512. doi:10.1111/2041210X.12044.
Brooks, M. E., Kristensen, K., Benthem, K. J. van, Magnusson, A., Berg, C. W., Nielsen, A., et al. (2017). Modeling ZeroInflated count data with glmmTMB. bioRxiv, 132753. doi:10.1101/132753.
Price, S. J., Muncy, B. L., Bonner, S. J., Drayer, A. N., and Barton, C. D. (2015). Data from: Effects of mountaintop removal mining and valley filling on the occupancy and abundance of stream salamanders. doi:10.5061/dryad.5m8f6.
Price, S. J., Muncy, B. L., Bonner, S. J., Drayer, A. N., and Barton, C. D. (2016). Effects of mountaintop removal mining and valley filling on the occupancy and abundance of stream salamanders. The Journal of applied ecology 53, 459–468. doi:10.1111/13652664.12585.
Roulin, A., and Bersier, L.F. (2007). Nestling barn owls beg more intensely in the presence of their mother than in the presence of their father. Animal behaviour 74, 1099–1106.
Wood, S. N., Pya, N., and Säfken, B. (2016). Smoothing parameter and model selection for general smooth models. Journal of the American Statistical Association 111, 1548–1563. doi:10.1080/01621459.2016.1180986.
Zuur, A. F., Ieno, E. N., Walker, N., Saveliev, A. A., and Smith, G. M. (2009). Mixed effects models and extensions in ecology with r: Springer New York doi:10.1007/9780387874586.
Rbloggers.com offers daily email updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/datascience job.
Want to share your content on Rbloggers? click here if you have a blog, or here if you don't.