Latin squares design in R

January 6, 2010

(This article was first published on Statistic on aiR, and kindly contributed to R-bloggers)

The Latin square design is used where the researcher desires to control the variation in an experiment that is related to rows and columns in the field.
Remember that:
* Treatments are assigned at random within rows and columns, with each treatment once per row and once per column.
* There are equal numbers of rows, columns, and treatments.
* The design is useful when the experimenter wants to control variation in two different directions.
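As a rough illustration of the randomization described above, the following sketch builds a 5 x 5 Latin square in base R. It starts from a cyclic square and shuffles rows, columns, and treatment labels. The names `base`, `idx`, and `square`, as well as the seed value, are illustrative choices and not part of the original analysis. This simple scheme does not sample uniformly from all possible Latin squares.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

r <- 5
trts <- LETTERS[1:r]

# Cyclic square: cell (i, j) holds symbol ((i + j - 2) mod r) + 1
base <- outer(1:r, 1:r, function(i, j) (i + j - 2) %% r + 1)

# Shuffling rows and columns keeps each symbol once per row and per column
idx <- base[sample(r), sample(r)]

# Randomly assign the treatment labels to the symbols 1..r
square <- matrix(sample(trts)[idx], nrow = r)
square
```

Every row and every column of `square` contains each of the letters A to E exactly once, which is the defining property of the design.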

The formulas used for this kind of three-way ANOVA are:

Source of variation   Degrees of freedom   Sums of squares (SSQ)   Mean square (MS)       F
Rows (R)              r-1                  SSQR                    SSQR/(r-1)             MSR/MSE
Columns (C)           r-1                  SSQC                    SSQC/(r-1)             MSC/MSE
Treatments (Tr)       r-1                  SSQTr                   SSQTr/(r-1)            MSTr/MSE
Error (E)             (r-1)(r-2)           SSQE                    SSQE/((r-1)(r-2))
Total (Tot)           r^2-1                SSQTot

where r = number of treatments = number of rows = number of columns.
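The degrees of freedom in the table can be checked with a few lines of R. This is only a sketch of the arithmetic for r = 5, the size used in the example later in this post. The vector name `df` is an illustrative choice.

```r
r <- 5  # number of treatments = rows = columns

# Degrees of freedom for each source of variation
df <- c(rows       = r - 1,
        columns    = r - 1,
        treatments = r - 1,
        error      = (r - 1) * (r - 2))
df

# The components add up to the total degrees of freedom, r^2 - 1
sum(df) == r^2 - 1
```

For r = 5 this gives 4 degrees of freedom for each factor and 12 for the error, 24 in total.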

Suppose you want to analyse the productivity of 5 kinds of fertilizer, 5 kinds of tillage, and 5 kinds of seed. The data are organized in a Latin square design, as follows:

            treatA treatB treatC treatD treatE
fertilizer1 "A42"  "C47"  "B55"  "D51"  "E44"
fertilizer2 "E45"  "B54"  "C52"  "A44"  "D50"
fertilizer3 "C41"  "A46"  "D57"  "E47"  "B48"
fertilizer4 "B56"  "D52"  "E49"  "C50"  "A43"
fertilizer5 "D47"  "E49"  "A45"  "B54"  "C46"

The three factors are: fertilizer (fertilizer1 to fertilizer5, the rows), tillage (treatA to treatE, the columns), and seed (A to E, the letter in each cell). The number in each cell is the productivity in cwt/year.

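Now create a data frame in R with these data:

# Fertilizer (row factor): the five levels repeat within each tillage column
fertil <- rep(c("fertil1", "fertil2", "fertil3", "fertil4", "fertil5"), times = 5)
# Tillage (column factor): each level covers five consecutive observations
treat <- c(rep("treatA",5), rep("treatB",5), rep("treatC",5), rep("treatD",5), rep("treatE",5))
# Seed (treatment factor), read column by column from the table above
seed <- c("A","E","C","B","D", "C","B","A","D","E", "B","C","D","E","A", "D","A","E","C","B", "E","D","B","A","C")
# Productivity, in the same column-by-column order
freq <- c(42,45,41,56,47, 47,54,46,52,49, 55,52,57,49,45, 51,44,47,50,54, 44,50,48,43,46)

mydata <- data.frame(treat, fertil, seed, freq)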


treat fertil seed freq
1 treatA fertil1 A 42
2 treatA fertil2 E 45
3 treatA fertil3 C 41
4 treatA fertil4 B 56
5 treatA fertil5 D 47
6 treatB fertil1 C 47
7 treatB fertil2 B 54
8 treatB fertil3 A 46
9 treatB fertil4 D 52
10 treatB fertil5 E 49
11 treatC fertil1 B 55
12 treatC fertil2 C 52
13 treatC fertil3 D 57
14 treatC fertil4 E 49
15 treatC fertil5 A 45
16 treatD fertil1 D 51
17 treatD fertil2 A 44
18 treatD fertil3 E 47
19 treatD fertil4 C 50
20 treatD fertil5 B 54
21 treatE fertil1 E 44
22 treatE fertil2 D 50
23 treatE fertil3 B 48
24 treatE fertil4 A 43
25 treatE fertil5 C 46
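
Before going further, it can be worth checking that the data frame really encodes a Latin square, since a single typing error in `seed` would break the design. The sketch below rebuilds the same data so that it runs on its own. The names `seed_by_fertil`, `seed_by_treat`, `cell_counts`, and `is_latin` are illustrative.

```r
fertil <- rep(paste0("fertil", 1:5), times = 5)
treat  <- rep(paste0("treat", LETTERS[1:5]), each = 5)
seed   <- c("A","E","C","B","D", "C","B","A","D","E", "B","C","D","E","A",
            "D","A","E","C","B", "E","D","B","A","C")
freq   <- c(42,45,41,56,47, 47,54,46,52,49, 55,52,57,49,45,
            51,44,47,50,54, 44,50,48,43,46)
mydata <- data.frame(treat, fertil, seed, freq)

# Each seed must appear exactly once per fertilizer and once per tillage
seed_by_fertil <- table(mydata$seed, mydata$fertil)
seed_by_treat  <- table(mydata$seed, mydata$treat)

# Each fertilizer x tillage cell must hold exactly one observation
cell_counts <- table(mydata$fertil, mydata$treat)

is_latin <- all(seed_by_fertil == 1) && all(seed_by_treat == 1) &&
  all(cell_counts == 1)
is_latin
```

If `is_latin` is `FALSE`, printing the three tables shows which combination is missing or duplicated.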

We can re-create the original table, using the matrix function:

matrix(mydata$seed, 5,5)

[,1] [,2] [,3] [,4] [,5]
[1,] "A" "C" "B" "D" "E"
[2,] "E" "B" "C" "A" "D"
[3,] "C" "A" "D" "E" "B"
[4,] "B" "D" "E" "C" "A"
[5,] "D" "E" "A" "B" "C"

matrix(mydata$freq, 5,5)

[,1] [,2] [,3] [,4] [,5]
[1,] 42 47 55 51 44
[2,] 45 54 52 44 50
[3,] 41 46 57 47 48
[4,] 56 52 49 50 43
[5,] 47 49 45 54 46
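
The `matrix` calls above rely on the observations being stored column by column, and they drop the row and column labels. As an alternative sketch, `xtabs` and `tapply` build the same tables from the factor columns themselves, so the result is labelled and does not depend on row order. The data are rebuilt here so the block runs on its own. The names `yield_table` and `cells` are illustrative.

```r
fertil <- rep(paste0("fertil", 1:5), times = 5)
treat  <- rep(paste0("treat", LETTERS[1:5]), each = 5)
seed   <- c("A","E","C","B","D", "C","B","A","D","E", "B","C","D","E","A",
            "D","A","E","C","B", "E","D","B","A","C")
freq   <- c(42,45,41,56,47, 47,54,46,52,49, 55,52,57,49,45,
            51,44,47,50,54, 44,50,48,43,46)
mydata <- data.frame(treat, fertil, seed, freq)

# Productivity by fertilizer (rows) and tillage (columns);
# each cell holds one observation, so the sum is just that value
yield_table <- xtabs(freq ~ fertil + treat, data = mydata)
yield_table

# Seed letter and productivity pasted together, as in the original layout
cells <- with(mydata, tapply(paste0(seed, freq), list(fertil, treat),
                             function(x) x))
cells
```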

Before proceeding with the analysis of variance of this Latin square design, it is useful to draw boxplots to get an idea of what to expect:

plot(freq ~ fertil+treat+seed, mydata)
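
The `plot` call above draws boxplots only when the three grouping columns are factors. Since R 4.0.0, `data.frame()` no longer converts strings to factors by default, so depending on the R version the call may not behave as shown. The sketch below sidesteps the question by converting explicitly (`stringsAsFactors = TRUE`) and calling `boxplot` directly. It also computes the group means as a numeric companion to the plots. The names `group_means` and `spread` are illustrative.

```r
fertil <- rep(paste0("fertil", 1:5), times = 5)
treat  <- rep(paste0("treat", LETTERS[1:5]), each = 5)
seed   <- c("A","E","C","B","D", "C","B","A","D","E", "B","C","D","E","A",
            "D","A","E","C","B", "E","D","B","A","C")
freq   <- c(42,45,41,56,47, 47,54,46,52,49, 55,52,57,49,45,
            51,44,47,50,54, 44,50,48,43,46)
mydata <- data.frame(treat, fertil, seed, freq, stringsAsFactors = TRUE)

# One boxplot per factor, side by side
op <- par(mfrow = c(1, 3))
boxplot(freq ~ fertil, data = mydata, main = "Fertilizer")
boxplot(freq ~ treat,  data = mydata, main = "Tillage")
boxplot(freq ~ seed,   data = mydata, main = "Seed")
par(op)

# Mean productivity per level of each factor
group_means <- lapply(mydata[c("fertil", "treat", "seed")],
                      function(g) tapply(mydata$freq, g, mean))

# Range of the group means: a crude measure of each factor's effect
spread <- sapply(group_means, function(m) diff(range(m)))
spread
```

The range of the group means is smallest for fertilizer and largest for seed.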

Note that the differences between groups are small for the fertilizer, moderate for the tillage, and large for the seed.
Now confirm these graphical observations with the ANOVA table:

myfit <- lm(freq ~ fertil+treat+seed, mydata)
anova(myfit)

Analysis of Variance Table

Response: freq
Df Sum Sq Mean Sq F value Pr(>F)
fertil 4 17.760 4.440 0.7967 0.549839
treat 4 109.360 27.340 4.9055 0.014105 *
seed 4 286.160 71.540 12.8361 0.000271 ***
Residuals 12 66.880 5.573
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
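
To connect this output with the formulas given earlier, the sums of squares can also be computed by hand. The sketch below uses the standard correction-factor approach for a balanced design. It is a check of the table, not a replacement for `anova`. The helper `ss` and the other variable names are illustrative.

```r
fertil <- rep(paste0("fertil", 1:5), times = 5)
treat  <- rep(paste0("treat", LETTERS[1:5]), each = 5)
seed   <- c("A","E","C","B","D", "C","B","A","D","E", "B","C","D","E","A",
            "D","A","E","C","B", "E","D","B","A","C")
freq   <- c(42,45,41,56,47, 47,54,46,52,49, 55,52,57,49,45,
            51,44,47,50,54, 44,50,48,43,46)

r  <- 5
cf <- sum(freq)^2 / r^2  # correction factor: (grand total)^2 / number of cells

# Sum of squares for one factor: sum of squared level totals / r, minus cf
ss <- function(g) sum(tapply(freq, g, sum)^2) / r - cf

ssq       <- c(fertil = ss(fertil), treat = ss(treat), seed = ss(seed))
ssq_total <- sum(freq^2) - cf
ssq_error <- ssq_total - sum(ssq)

ms_error <- ssq_error / ((r - 1) * (r - 2))
f_values <- (ssq / (r - 1)) / ms_error

round(ssq, 2)
round(ssq_error, 2)
round(f_values, 4)
```

The values agree with the Sum Sq and F value columns of the ANOVA table above.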

The boxplots were a good guide. Look at the significance of the F-tests:
* The difference between groups for the fertilizer is not significant (p-value = 0.55, well above 0.1).
* The difference between groups for the tillage is significant at the 5% level (p-value = 0.014 < 0.05).
* The difference between groups for the seed is highly significant (p-value = 0.00027 < 0.001).
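
The p-values follow directly from the F values and the degrees of freedom (4 for each factor, 12 for the residuals). As a small sketch, they can be recovered with `pf`, using the F values copied from the ANOVA table. The names `f_values`, `p_values`, and `alpha` are illustrative, and 0.05 is just the conventional threshold.

```r
# F values copied from the ANOVA table (rounded to 4 decimals)
f_values <- c(fertil = 0.7967, treat = 4.9055, seed = 12.8361)

# Upper-tail probability of the F distribution with 4 and 12 degrees of freedom
p_values <- pf(f_values, df1 = 4, df2 = 12, lower.tail = FALSE)
round(p_values, 6)

# Which factors are significant at the conventional 5% level?
alpha <- 0.05
p_values < alpha
```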
