# Experimental Design: Problem Set

June 21, 2012

(This article was first published on ALSTAT R Blog, and kindly contributed to R-bloggers)

#### QUESTIONS

1. The tensile strength of Portland cement is being studied. Four different mixing techniques can be used economically. The following data have been collected:

| Mixing Technique | Tensile Strength (lb/in²) | | | |
|---|---|---|---|---|
| 1 | 3129 | 3000 | 2865 | 2890 |
| 2 | 3200 | 3300 | 2975 | 3150 |
| 3 | 2800 | 2900 | 2985 | 3050 |
| 4 | 2600 | 2700 | 2600 | 2765 |

• Test the hypothesis that mixing techniques affect the strength of the cement. Use $\alpha=0.05$.
• Construct a graphical display as described in Section 3-5.3 to compare the mean tensile strengths for the four mixing techniques. What are your conclusions?
• Use the Fisher LSD method with $\alpha=0.05$ to make comparisons between pairs of means.
• Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?
• Plot the residuals versus the predicted tensile strength. Comment on the plot.
• Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.

2.

• Rework part (b) of Problem 3-1 using Duncan’s multiple range test with $\alpha=0.05$. Does this make any difference in your conclusions?
• Rework part (b) of Problem 3-1 using Tukey’s test with $\alpha=0.05$. Do you get the same conclusions from Tukey’s test that you did from the graphical procedure and/or Duncan’s multiple range test?

#### COMPUTATIONAL AND GRAPHICAL SECTION

1. The tensile strength of Portland cement is being studied. Four different mixing techniques can be used economically. The following data have been collected:

| Mixing Technique | Tensile Strength (lb/in²) | | | | Totals $(y_{i.})$ | Averages $(\bar{y}_{i.})$ |
|---|---|---|---|---|---|---|
| 1 | 3129 | 3000 | 2865 | 2890 | 11884 | 2971.00 |
| 2 | 3200 | 3300 | 2975 | 3150 | 12625 | 3156.25 |
| 3 | 2800 | 2900 | 2985 | 3050 | 11735 | 2933.75 |
| 4 | 2600 | 2700 | 2600 | 2765 | 10665 | 2666.25 |
| | | | | | $y_{..}=46909$ | $\bar{y}_{..}=2931.81$ |

• Test the hypothesis that mixing techniques affect the strength of the cement. Use $\alpha=0.05$.

I.      Hypotheses:
H0: $\mu_{1}=\mu_{2}=\mu_{3}=\mu_{4}$
H1: some means are different.
II.     Level of significance: $\alpha = 0.05$
III.    Test Statistics: $$F_{0}=\frac{\frac{SS_{Treatments}}{a-1}}{\frac{SS_{E}}{N-a}}=\frac{MS_{Treatments}}{MS_{E}}$$
IV.   Rejection Region:
$$F_{0}>F_{\alpha,a-1,N-a}\\F_{0}>F_{0.05,3,12}\\F_{0}>3.49$$
V.    Computation:
$$SS_{T}=\sum_{i=1}^{4}\sum_{j=1}^{4}y_{ij}^2-\frac{y_{..}^2}{N}\\=(3129)^2+(3000)^2+\dots+(2600)^2+(2765)^2-\frac{(46909)^2}{16}\\=138172041-\frac{(46909)^2}{16}=643648.4375\\SS_{Treatments}=\frac{1}{n}\sum_{i=1}^{4}y_{i.}^2-\frac{y_{..}^2}{N}\\=\frac{1}{4}[(11884)^2+\dots+(10665)^2]-\frac{(46909)^2}{16}=489740.1875\\SS_{E}=SS_{T}-SS_{Treatments}\\=643648.4375-489740.1875=153908.25$$

ANOVA Table

| Source | Sum of Squares | Degrees of Freedom | Mean Square | $F_{0}$ | P-Value |
|---|---|---|---|---|---|
| Model | 489740.19 | 3 | 163246.73 | 12.73 | 0.0005 |
| Error | 153908.25 | 12 | 12825.69 | | |
| Total | 643648.44 | 15 | | | |

The F-value of 12.73 exceeds the tabulated value of 3.49, and the p-value (0.0005) is below the 0.05 level of significance. We therefore reject the null hypothesis and conclude that the mixing technique significantly affects the tensile strength of the cement.
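
The sums of squares above can be re-derived with a few lines of code. The following is a minimal Python sketch (standard library only) rather than the code used for the original post; the variable names are illustrative, and the critical value 3.49 is simply copied from the rejection region above, not computed.

```python
# Tensile strength (lb/in^2) by mixing technique, from the data table above.
data = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}

a = len(data)       # number of treatments
n = len(data[1])    # replicates per treatment (balanced design)
N = a * n           # total number of observations

grand_total = sum(sum(obs) for obs in data.values())
correction = grand_total ** 2 / N   # y..^2 / N

ss_total = sum(y ** 2 for obs in data.values() for y in obs) - correction
ss_treat = sum(sum(obs) ** 2 for obs in data.values()) / n - correction
ss_error = ss_total - ss_treat

ms_treat = ss_treat / (a - 1)
ms_error = ss_error / (N - a)
f0 = ms_treat / ms_error

F_CRIT = 3.49   # tabulated F(0.05; 3, 12), taken from the text

print(f"SS_T = {ss_total:.4f}")
print(f"SS_Treatments = {ss_treat:.4f}")
print(f"SS_E = {ss_error:.4f}")
print(f"F0 = {f0:.2f}, reject H0: {f0 > F_CRIT}")
```
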
• Construct a graphical display as described in Section 3-5.3 to compare the mean tensile strengths for the four mixing techniques. What are your conclusions?
Dashed lines in the plot, by color:
Red – $\bar{y}_{4}$ Mean of Treatment 4 (2666.25)
Pink – $\bar{y}_{..}$ Grand Mean (2931.81)
Brown – $\bar{y}_{3}$ Mean of Treatment 3 (2933.75)
Green – $\bar{y}_{1}$ Mean of Treatment 1 (2971.00)
Blue – $\bar{y}_{2}$ Mean of Treatment 2 (3156.25)

Based on the plot and the data, we conclude that $\bar{y}_{1}$ and $\bar{y}_{3}$ do not differ (see also the scatter plot in the last part of Question 1). Moreover, $\bar{y}_{4}$ differs from $\bar{y}_{1}$ and $\bar{y}_{3}$, $\bar{y}_{2}$ differs from $\bar{y}_{1}$ and $\bar{y}_{3}$, and $\bar{y}_{2}$ and $\bar{y}_{4}$ differ from each other.

How did I do it?
First, we plot a Student’s t distribution with $N-1=15$ degrees of freedom. We then mark the four treatment means along the x-axis. Because the raw means are far too large to fall on the scale of the t distribution, we first convert them to t-values using the following formula,$$t=\frac{\bar{y}_{i}-\bar{y}_{..}}{\frac{\sigma}{\sqrt{n}}}$$
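
The conversion to t-values can be sketched in Python as follows. This is an illustration, not the code behind the original plot: it assumes that the unknown $\sigma$ is replaced by $\sqrt{MS_{E}}$ from the ANOVA table, so that the scale factor is $\sqrt{MS_{E}/n}$.

```python
import math

# Treatment averages and grand mean from the data table above.
means = {1: 2971.00, 2: 3156.25, 3: 2933.75, 4: 2666.25}
grand_mean = 2931.8125   # y-bar.. = 46909 / 16
ms_error = 12825.6875    # MS_E = 153908.25 / 12
n = 4                    # replicates per treatment

# Assumption: sigma is estimated by sqrt(MS_E), giving scale sqrt(MS_E / n).
scale = math.sqrt(ms_error / n)

# Position of each treatment mean on the t scale.
t_values = {i: (m - grand_mean) / scale for i, m in means.items()}

for i in sorted(t_values, key=t_values.get):
    print(f"Treatment {i}: t = {t_values[i]:.3f}")
```

On this scale, Treatments 1 and 3 land close together near the center, while Treatments 2 and 4 fall far out on opposite sides.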
• Use the Fisher LSD method with $\alpha = 0.05$ to make comparisons between pairs of means.$$LSD=t_{\frac{\alpha}{2},N-a}\sqrt{\frac{2MS_{E}}{n}}=t_{0.025,16-4}\sqrt{\frac{2(12825.7)}{4}}=2.179\sqrt{6412.85}=174.495$$
Thus, any pair of treatment averages that differ in absolute value by more than 174.495 would imply that the corresponding pair of population means are significantly different.

The absolute differences in averages are$$|\bar{y}_{1.}-\bar{y}_{2.}|=|2971.00-3156.25|=185.25>174.495*\\|\bar{y}_{1.}-\bar{y}_{3.}|=|2971.00-2933.75|=37.25<174.495\\|\bar{y}_{1.}-\bar{y}_{4.}|=|2971.00-2666.25|=304.75>174.495*\\|\bar{y}_{2.}-\bar{y}_{3.}|=|3156.25-2933.75|=222.50>174.495*\\|\bar{y}_{2.}-\bar{y}_{4.}|=|3156.25-2666.25|=490.00>174.495*\\|\bar{y}_{3.}-\bar{y}_{4.}|=|2933.75-2666.25|=267.50>174.495*$$
The starred values indicate pairs of means that are significantly different.
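
As a cross-check, the LSD comparisons can be reproduced with a short Python sketch. The critical value 2.179 is the tabulated $t_{0.025,12}$ quoted above and is hard-coded here rather than computed; the names `means`, `lsd`, and `significant` are illustrative.

```python
import math
from itertools import combinations

means = {1: 2971.00, 2: 3156.25, 3: 2933.75, 4: 2666.25}
ms_error = 12825.6875   # MS_E from the ANOVA table
n = 4                   # replicates per treatment
T_CRIT = 2.179          # tabulated t(0.025; 12), taken from the text

# Least significant difference for a balanced design.
lsd = T_CRIT * math.sqrt(2 * ms_error / n)

significant = {}
for i, j in combinations(sorted(means), 2):
    diff = abs(means[i] - means[j])
    significant[(i, j)] = diff > lsd
    flag = "*" if significant[(i, j)] else ""
    print(f"{i} vs. {j}: |diff| = {diff:.2f} {flag}")

print(f"LSD = {lsd:.3f}")
```
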
• Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?

Nothing in the plot looks unusual. The points stay within the 95 percent confidence band, so the normality assumption appears reasonable.
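
The coordinates of such a normal probability plot can be computed as in the Python sketch below. It is an illustration under one stated assumption: the plotting position $(k-0.5)/N$ for the $k$-th smallest residual is a common convention, and the original plot may have used a different one.

```python
from statistics import NormalDist

data = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}

# Residual = observation minus its treatment average (the fitted value).
residuals = []
for obs in data.values():
    fitted = sum(obs) / len(obs)
    residuals.extend(y - fitted for y in obs)

residuals.sort()
N = len(residuals)

# Assumed plotting position (k - 0.5) / N for the k-th smallest residual.
std_normal = NormalDist()
quantiles = [std_normal.inv_cdf((k - 0.5) / N) for k in range(1, N + 1)]

# Each pair is one point: theoretical normal quantile vs. ordered residual.
for q, e in zip(quantiles, residuals):
    print(f"{q:7.3f}  {e:9.2f}")
```

If the residuals are roughly normal, the printed pairs fall close to a straight line when plotted.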

• Plot the residuals versus the predicted tensile strength. Comment on the plot.

The plot shows a slight outward-opening funnel (megaphone) shape. The pattern is not pronounced, but it suggests that the error variance may not be constant.
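
One rough way to put numbers on the funnel shape is to compare the spread of the residuals at each fitted value. The Python sketch below uses the residual range per treatment; with only four observations per group this is a crude indicator and not a formal test of equal variances.

```python
data = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}

summary = []
for technique, obs in data.items():
    fitted = sum(obs) / len(obs)             # predicted value = treatment mean
    resid = [y - fitted for y in obs]
    spread = max(resid) - min(resid)         # range of residuals in this group
    summary.append((fitted, technique, spread))

summary.sort()   # order groups by predicted tensile strength

for fitted, technique, spread in summary:
    print(f"fitted = {fitted:8.2f}  technique {technique}  residual range = {spread:.2f}")
```

For these data the residual range grows with the predicted value, which is the pattern the funnel shape describes.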

• Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.

2. Rework part (b) of Problem 3-1 using Duncan’s multiple range test with $\alpha=0.05$. Does this make any difference in your conclusions?

Ranking the treatment averages in ascending order, we have$$\bar{y}_{4.}=2666.25\\\bar{y}_{3.}=2933.75\\\bar{y}_{1.}=2971.00\\\bar{y}_{2.}=3156.25$$
The standard error of each average is $S_{\bar{y}_{i.}}=\sqrt{\frac{12825.69}{4}}=56.625$. From the table of significant ranges for 12 degrees of freedom and $\alpha=0.05$, we obtain $r_{0.05}(2,12)=3.081$, $r_{0.05}(3,12)=3.225$, and $r_{0.05}(4,12)=3.312$. Thus, the least significant ranges are$$R_{2}=r_{0.05}(2,12)S_{\bar{y}_{i.}}=(3.081)(56.625)=174.46\\R_{3}=r_{0.05}(3,12)S_{\bar{y}_{i.}}=(3.225)(56.625)=182.62\\R_{4}=r_{0.05}(4,12)S_{\bar{y}_{i.}}=(3.312)(56.625)=187.54$$
The comparisons yield$$2\text{ vs. }4:\ 3156.25-2666.25=490.00>187.54\ (R_{4})\\2\text{ vs. }3:\ 3156.25-2933.75=222.50>182.62\ (R_{3})\\2\text{ vs. }1:\ 3156.25-2971.00=185.25>174.46\ (R_{2})\\1\text{ vs. }4:\ 2971.00-2666.25=304.75>182.62\ (R_{3})\\1\text{ vs. }3:\ 2971.00-2933.75=37.25<174.46\ (R_{2})\\3\text{ vs. }4:\ 2933.75-2666.25=267.50>174.46\ (R_{2})$$
The analysis shows significant differences between all pairs of means except Treatments 1 and 3. This matches the conclusion from the LSD method, so Duncan’s multiple range test and the LSD method lead to identical conclusions here.
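
The Duncan comparisons can be reproduced with the Python sketch below. The significant ranges $r_{0.05}(p,12)$ are the tabulated values quoted above and are hard-coded. The sketch is simplified: it compares every pair directly against $R_{p}$, as the worked comparisons do, and does not implement the stepwise rule of the full procedure, under which pairs inside a non-significant group are not tested further.

```python
import math

means = {1: 2971.00, 2: 3156.25, 3: 2933.75, 4: 2666.25}
ms_error = 12825.6875   # MS_E from the ANOVA table
n = 4                   # replicates per treatment

# Tabulated significant ranges r_0.05(p, 12), taken from the text.
R_TABLE = {2: 3.081, 3: 3.225, 4: 3.312}

se = math.sqrt(ms_error / n)                  # standard error of a treatment average
lsr = {p: r * se for p, r in R_TABLE.items()}  # least significant ranges R_p

ranked = sorted(means, key=means.get)          # treatments in ascending order of mean

significant = {}
for lo in range(len(ranked)):
    for hi in range(lo + 1, len(ranked)):
        p = hi - lo + 1                        # number of ranked means spanned
        i, j = ranked[lo], ranked[hi]
        diff = means[j] - means[i]
        significant[(i, j)] = diff > lsr[p]
        print(f"{j} vs. {i}: {diff:.2f} vs. R{p} = {lsr[p]:.2f} -> {significant[(i, j)]}")
```
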
• Rework part (b) of Problem 3-1 using Tukey’s test with $\alpha=0.05$. Do you get the same conclusions from Tukey’s test that you did from the graphical procedure and/or Duncan’s multiple range test?$$T_{0.05}=q_{0.05}(4,12)\sqrt{\frac{MS_{E}}{n}}=4.20\sqrt{\frac{12825.69}{4}}=4.20(56.625)=237.825$$
Thus, any pair of treatment averages that differ in absolute value by more than 237.825 would imply that the corresponding pair of population means are significantly different. The four treatment averages are$$\bar{y}_{1.}=2971.00~~~~~\bar{y}_{2.}=3156.25~~~~~\bar{y}_{3.}=2933.75~~~~~\bar{y}_{4.}=2666.25$$and the absolute differences in averages are$$|\bar{y}_{1.}-\bar{y}_{2.}|=|2971.00-3156.25|=185.25\\|\bar{y}_{1.}-\bar{y}_{3.}|=|2971.00-2933.75|=37.25\\|\bar{y}_{1.}-\bar{y}_{4.}|=|2971.00-2666.25|=304.75*\\|\bar{y}_{2.}-\bar{y}_{3.}|=|3156.25-2933.75|=222.50\\|\bar{y}_{2.}-\bar{y}_{4.}|=|3156.25-2666.25|=490.00*\\|\bar{y}_{3.}-\bar{y}_{4.}|=|2933.75-2666.25|=267.50*$$The starred values indicate pairs of means that are significantly different.

The conclusions are not the same. Under all of the methods, the mean of Treatment 4 differs from the means of Treatments 1, 2, and 3, and the means of Treatments 1 and 3 do not differ. Under Tukey’s test, however, the mean of Treatment 2 is not significantly different from the means of Treatments 1 and 3, whereas the graphical method, the Fisher LSD method, and Duncan’s multiple range test all found those pairs to be different.
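
The Tukey comparisons can be checked with a short Python sketch. The studentized range value 4.20 is the tabulated $q_{0.05}(4,12)$ quoted above and is hard-coded rather than computed; the variable names are illustrative.

```python
import math
from itertools import combinations

means = {1: 2971.00, 2: 3156.25, 3: 2933.75, 4: 2666.25}
ms_error = 12825.6875   # MS_E from the ANOVA table
n = 4                   # replicates per treatment
Q_CRIT = 4.20           # tabulated q(0.05; 4, 12), taken from the text

# Tukey's critical difference for a balanced design.
t_crit_diff = Q_CRIT * math.sqrt(ms_error / n)

significant = {}
for i, j in combinations(sorted(means), 2):
    diff = abs(means[i] - means[j])
    significant[(i, j)] = diff > t_crit_diff
    flag = "*" if significant[(i, j)] else ""
    print(f"{i} vs. {j}: |diff| = {diff:.2f} {flag}")

print(f"T = {t_crit_diff:.3f}")
```

Only the three pairs involving Treatment 4 exceed the critical difference, which is why Tukey’s test is more conservative here than the LSD and Duncan procedures.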

Reference:
Design and Analysis of Experiments by Douglas C. Montgomery

#### R CODES SECTION
