(This article was first published on

I wanted to have a look at various REML methods for a long time. The rainfall data seemed a nice example. On top of that, FreshBiostats had a blog post 'Mixed Models in R: lme4, nlme, or both?'. So lme4 it is.**Wiekvoet**, and kindly contributed to R-bloggers)### The data

As described before; the data are from KNMI; ROYAL NETHERLANDS METEOROLOGICAL INSTITUTE. The actual page for the data is http://www.knmi.nl/klimatologie/monv/reeksen/select_rr.html. The data selected were from 6 locations; De Bilt, Groningen, Heerde, Kerkwerve, Ter Apel and Winterswijk (Sibinkweg). The blog starts with data entered in a data.frame named all which was derived a few weeks ago.

### Analysis

A simple analysis would ignore years. The data are plotted below, the figure indicates a clear effect.

sel1 <- all[(all$year <1916 | all$year>2003) & all$Month=='Jan' ,]

sel1$decade <- factor(c('1906-1915','2004-2013')[1+(sel1$year>1950)])

sel1$Rain <- as.numeric(sel1$RD>0)

ag1 <- aggregate(sel1$Rain,list(Year=sel1$Year,

location=sel1$location,

decade=sel1$decade),FUN=sum)

ag1$Year <- factor(ag1$Year)

p <- ggplot(ag1, aes(decade, x/31))

p + geom_boxplot(notch=TRUE)

#### Analysis part 2

The problem appears once years are plotted; It seems more of the years in the first decade have a low proportion days with rain, but some of the last decade also have less days with rain.p <- ggplot(ag1, aes(decade, x/31,col=Year))

p + geom_boxplot()

As written lme4 was my package of choice. This does give a significant effect, but it only at 3%.

library(lme4)

l1 <- glmer(cbind(31-x,x) ~ (1|Year) ,data=ag1,family=binomial)

l1

Generalized linear mixed model fit by the Laplace approximation

Formula: cbind(31 - x, x) ~ (1 | Year)

Data: ag1

AIC BIC logLik deviance

231.3 236.9 -113.7 227.3

Random effects:

Groups Name Variance Std.Dev.

Year (Intercept) 0.38069 0.617

Number of obs: 120, groups: Year, 20

Fixed effects:

Estimate Std. Error z value Pr(>|z|)

(Intercept) -0.3950 0.1423 -2.775 0.00552 **

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

l2

Generalized linear mixed model fit by the Laplace approximation

Formula: cbind(31 - x, x) ~ decade + (1 | Year)

Data: ag1

AIC BIC logLik deviance

228.5 236.9 -111.3 222.5

Random effects:

Groups Name Variance Std.Dev.

Year (Intercept) 0.29496 0.5431

Number of obs: 120, groups: Year, 20

Fixed effects:

Estimate Std. Error z value Pr(>|z|)

(Intercept) -0.1019 0.1783 -0.572 0.5674

decade2004-2013 -0.5862 0.2528 -2.319 0.0204 *

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:

(Intr)

dc2004-2013 -0.705

anova(l1,l2)

Data: ag1

Models:

l1: cbind(31 - x, x) ~ (1 | Year)

l2: cbind(31 - x, x) ~ decade + (1 | Year)

Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)

l1 2 231.31 236.88 -113.65

l2 3 228.53 236.90 -111.27 4.772 1 0.02893 *

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

To

**leave a comment**for the author, please follow the link and comment on his blog:**Wiekvoet**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...