Latin squares design in R
The Latin square design is used where the researcher desires to control the variation in an experiment that is related to rows and columns in the field.
Remember that:
* Treatments are assigned at random within rows and columns, with each treatment once per row and once per column.
* There are equal numbers of rows, columns, and treatments.
* The design is useful where the experimenter wants to control variation in two different directions.
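The randomization described above can be sketched in base R. This is only an illustration: the helper name `random_latin_square` is made up for this example, and shuffling a cyclic square does not sample uniformly from all possible Latin squares, although every result does satisfy the once-per-row and once-per-column rule.

```r
# Illustrative helper (not from the original post): build a random r x r Latin square.
random_latin_square <- function(r, labels = LETTERS[seq_len(r)]) {
  # Start from a cyclic square: entry (i, j) is ((i + j - 2) mod r) + 1
  base <- outer(seq_len(r), seq_len(r), function(i, j) (i + j - 2) %% r + 1)
  # Shuffling whole rows and whole columns keeps each symbol once per row and column
  shuffled <- base[sample(r), sample(r), drop = FALSE]
  # Assign the treatment labels to the symbols 1..r in random order
  lab <- sample(labels)
  matrix(lab[shuffled], nrow = r, ncol = r)
}

set.seed(1)  # only to make the example reproducible
sq <- random_latin_square(5)
print(sq)
```

Each row and each column of `sq` contains the five treatment letters exactly once.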
The formulas used for this kind of three-way ANOVA are:
Source of variation | Degrees of freedom^{a} | Sums of squares (SSQ) | Mean square (MS) | F
Rows (R) | r-1 | SSQ_{R} | SSQ_{R}/(r-1) | MS_{R}/MS_{E}
Columns (C) | r-1 | SSQ_{C} | SSQ_{C}/(r-1) | MS_{C}/MS_{E}
Treatments (Tr) | r-1 | SSQ_{Tr} | SSQ_{Tr}/(r-1) | MS_{Tr}/MS_{E}
Error (E) | (r-1)(r-2) | SSQ_{E} | SSQ_{E}/((r-1)(r-2)) |
Total (Tot) | r^{2}-1 | SSQ_{Tot} | |
^{a} where r = number of treatments = number of rows = number of columns.
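To see how the entries of this table come about, the sums of squares can be computed by hand. The sketch below uses the productivity data from the example in this post, with rows as fertilizers, columns as tillage methods and letters as seeds. The table itself does not spell out the SSQ formulas, so the usual correction-factor form is assumed here, and the variable names (`cf`, `ssq_r`, and so on) are chosen just for this illustration.

```r
# Productivity data from the post: rows = fertilizer1..5, columns = treatA..E
freq <- matrix(c(42, 47, 55, 51, 44,
                 45, 54, 52, 44, 50,
                 41, 46, 57, 47, 48,
                 56, 52, 49, 50, 43,
                 47, 49, 45, 54, 46), nrow = 5, byrow = TRUE)
# Seed letter assigned to each cell of the square
seed <- matrix(c("A", "C", "B", "D", "E",
                 "E", "B", "C", "A", "D",
                 "C", "A", "D", "E", "B",
                 "B", "D", "E", "C", "A",
                 "D", "E", "A", "B", "C"), nrow = 5, byrow = TRUE)

r  <- nrow(freq)
cf <- sum(freq)^2 / r^2                     # correction factor: (grand total)^2 / r^2

ssq_tot <- sum(freq^2) - cf                 # total
ssq_r   <- sum(rowSums(freq)^2) / r - cf    # rows (fertilizer)
ssq_c   <- sum(colSums(freq)^2) / r - cf    # columns (tillage)
seed_totals <- tapply(as.vector(freq), as.vector(seed), sum)
ssq_tr  <- sum(seed_totals^2) / r - cf      # treatments (seed)
ssq_e   <- ssq_tot - ssq_r - ssq_c - ssq_tr # error, by subtraction

# Mean squares and F ratios, following the table
ms   <- c(rows = ssq_r, columns = ssq_c, treatments = ssq_tr) / (r - 1)
ms_e <- ssq_e / ((r - 1) * (r - 2))
f    <- ms / ms_e

print(round(c(rows = ssq_r, columns = ssq_c, treatments = ssq_tr, error = ssq_e), 2))
print(round(f, 4))
```

The error degrees of freedom follow from the total: (r^2 - 1) - 3(r - 1) = (r - 1)(r - 2), which is 12 for r = 5.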
Suppose you want to analyse the productivity of 5 kinds of fertilizer, 5 kinds of tillage, and 5 kinds of seed. The data are organized in a Latin square design, as follows:
            treatA treatB treatC treatD treatE
fertilizer1 "A42"  "C47"  "B55"  "D51"  "E44"
fertilizer2 "E45"  "B54"  "C52"  "A44"  "D50"
fertilizer3 "C41"  "A46"  "D57"  "E47"  "B48"
fertilizer4 "B56"  "D52"  "E49"  "C50"  "A43"
fertilizer5 "D47"  "E49"  "A45"  "B54"  "C46"
The three factors are: fertilizer (fertilizer1:5), tillage (treatA:E), seed (A:E). The numbers are the productivity in cwt / year.
Now create a dataframe in R with these data:
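# Values are entered column by column (treatA first), so that matrix(..., 5, 5) rebuilds the table
fertil <- rep(paste0("fertilizer", 1:5), times = 5)
treat <- rep(paste0("treat", LETTERS[1:5]), each = 5)
seed <- c("A","E","C","B","D", "C","B","A","D","E", "B","C","D","E","A", "D","A","E","C","B", "E","D","B","A","C")
freq <- c(42,45,41,56,47, 47,54,46,52,49, 55,52,57,49,45, 51,44,47,50,54, 44,50,48,43,46)
# stringsAsFactors = TRUE so that the plots and the ANOVA treat the three columns as factors
mydata <- data.frame(treat, fertil, seed, freq, stringsAsFactors = TRUE)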
We can re-create the original table, using the matrix function:
matrix(mydata$seed, 5, 5)
     [,1] [,2] [,3] [,4] [,5]
[1,] "A"  "C"  "B"  "D"  "E"
[2,] "E"  "B"  "C"  "A"  "D"
[3,] "C"  "A"  "D"  "E"  "B"
[4,] "B"  "D"  "E"  "C"  "A"
[5,] "D"  "E"  "A"  "B"  "C"

matrix(mydata$freq, 5, 5)
     [,1] [,2] [,3] [,4] [,5]
[1,]   42   47   55   51   44
[2,]   45   54   52   44   50
[3,]   41   46   57   47   48
[4,]   56   52   49   50   43
[5,]   47   49   45   54   46
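It is worth checking that the layout really is a Latin square before analysing it. A minimal sketch, assuming the data are stored in long format with one row per plot (the level names such as "fertilizer1" are taken from the table above, and `is_latin` is just a name chosen for this example):

```r
# Long-format design, entered column by column of the table in the post
fertil <- rep(paste0("fertilizer", 1:5), times = 5)
treat  <- rep(paste0("treat", LETTERS[1:5]), each = 5)
seed   <- c("A","E","C","B","D", "C","B","A","D","E", "B","C","D","E","A",
            "D","A","E","C","B", "E","D","B","A","C")
mydata <- data.frame(fertil, treat, seed)

# In a Latin square every seed appears exactly once per fertilizer (row)
# and once per tillage (column), so both cross-tabulations contain only 1s
by_row <- table(mydata$fertil, mydata$seed)
by_col <- table(mydata$treat, mydata$seed)
is_latin <- all(by_row == 1) && all(by_col == 1)
print(is_latin)
```

If a seed letter were mistyped, one of the two tables would contain a 0 and a 2, and `is_latin` would be FALSE.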
Before proceeding with the analysis of variance of this Latin square design, you should draw boxplots to get an idea of what to expect:
par(mfrow=c(2,2))
plot(freq ~ fertil+treat+seed, mydata)
Note that the differences are small between fertilizers, moderate between tillage methods, and very large between seeds.
Now confirm these graphical observations with the ANOVA table:
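myfit <- lm(freq ~ fertil+treat+seed, mydata)
anova(myfit)

Analysis of Variance Table

Response: freq
          Df  Sum Sq Mean Sq F value   Pr(>F)
fertil     4  17.760   4.440  0.7967 0.549839
treat      4 109.360  27.340  4.9055 0.014105 *
seed       4 286.160  71.540 12.8361 0.000271 ***
Residuals 12  66.880   5.573
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1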
Well, the boxplot was useful. Look at the significance of the F-test.
– The differences between the fertilizer groups are not significant (p-value > 0.1);
– The differences between the tillage groups are significant (p-value < 0.05);
– The differences between the seed groups are highly significant (p-value < 0.001).
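The p-values in the ANOVA table can be reproduced from the F values alone, which makes clear where the thresholds above come from. The sketch below takes the rounded F values from the table and evaluates the upper tail of the F distribution with r-1 = 4 and (r-1)(r-2) = 12 degrees of freedom; the vector names are chosen for this example.

```r
# F values as printed in the ANOVA table (rounded to 4 decimals)
f_values <- c(fertil = 0.7967, treat = 4.9055, seed = 12.8361)

r   <- 5                   # number of rows = columns = treatments
df1 <- r - 1               # degrees of freedom of each factor
df2 <- (r - 1) * (r - 2)   # error degrees of freedom

# Upper-tail probability of the F distribution gives the p-value
p_values <- pf(f_values, df1 = df1, df2 = df2, lower.tail = FALSE)
print(signif(p_values, 3))
```

Because the F values are rounded, the results agree with the table only up to rounding.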