More rainfall calculations – REML

August 4, 2013
By

(This article was first published on Wiekvoet, and kindly contributed to R-bloggers)

I wanted to have a look at various REML methods for a long time. The rainfall data seemed a nice example. On top of that, FreshBiostats had a blog post 'Mixed Models in R: lme4, nlme, or both?'. So lme4 it is.

The data  

As described before; the data are from KNMI; ROYAL NETHERLANDS METEOROLOGICAL INSTITUTE. The actual page for the data is http://www.knmi.nl/klimatologie/monv/reeksen/select_rr.html.  The data selected were from 6 locations; De Bilt, Groningen, Heerde, Kerkwerve, Ter Apel and Winterswijk (Sibinkweg). The blog starts with data entered in a data.frame named all which was derived a few weeks ago.

Analysis

A simple analysis would ignore years. The data are plotted below, the figure indicates a clear effect.  
sel1 <- all[(all$year <1916 | all$year>2003) 
        & all$Month=='Jan' ,]
sel1$decade <- factor(c('1906-1915','2004-2013')[1+(sel1$year>1950)]) 
sel1$Rain <- as.numeric(sel1$RD>0)

ag1 <- aggregate(sel1$Rain,list(Year=sel1$Year,
        location=sel1$location,
        decade=sel1$decade),FUN=sum)
ag1$Year <- factor(ag1$Year)
p <- ggplot(ag1, aes(decade, x/31))
p + geom_boxplot(notch=TRUE)

Analysis part 2

The problem appears once years are plotted; It seems more of the years in the first decade have a low proportion days with rain, but some of the last decade also have less days with rain.
p <- ggplot(ag1, aes(decade, x/31,col=Year))
p + geom_boxplot()
As written lme4 was my package of choice. This does give a significant effect, but it only at 3%.
library(lme4)
l1 <- glmer(cbind(31-x,x) ~  (1|Year)  ,data=ag1,family=binomial)
l1
Generalized linear mixed model fit by the Laplace approximation 
Formula: cbind(31 - x, x) ~ (1 | Year) 
   Data: ag1 
   AIC   BIC logLik deviance
 231.3 236.9 -113.7    227.3
Random effects:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.38069  0.617   
Number of obs: 120, groups: Year, 20

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)   
(Intercept)  -0.3950     0.1423  -2.775  0.00552 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

l2 <- glmer(cbind(31-x,x) ~  decade + (1|Year) ,data=ag1,family=binomial)
l2
Generalized linear mixed model fit by the Laplace approximation 
Formula: cbind(31 - x, x) ~ decade + (1 | Year) 
   Data: ag1 
   AIC   BIC logLik deviance
 228.5 236.9 -111.3    222.5
Random effects:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.29496  0.5431  
Number of obs: 120, groups: Year, 20

Fixed effects:
                Estimate Std. Error z value Pr(>|z|)  
(Intercept)      -0.1019     0.1783  -0.572   0.5674  
decade2004-2013  -0.5862     0.2528  -2.319   0.0204 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr)

dc2004-2013 -0.705
anova(l1,l2)
Data: ag1
Models:
l1: cbind(31 - x, x) ~ (1 | Year)
l2: cbind(31 - x, x) ~ decade + (1 | Year)
   Df    AIC    BIC  logLik Chisq Chi Df Pr(>Chisq)  
l1  2 231.31 236.88 -113.65                          
l2  3 228.53 236.90 -111.27 4.772      1    0.02893 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

To leave a comment for the author, please follow the link and comment on his blog: Wiekvoet.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.