Experimental Design: Problem Set

Posted on June 21, 2012 by Al-Ahmadgaid Asaad in R bloggers

[This article was first published on ALSTAT R Blog, and kindly contributed to R-bloggers].

QUESTIONS

The tensile strength of Portland cement is being studied. Four different mixing techniques can be used economically. The following data have been collected:

Mixing Techniques	Tensile Strength (lb/in²)
1	3129	3000	2865	2890
2	3200	3300	2975	3150
3	2800	2900	2985	3050
4	2600	2700	2600	2765

Test the hypothesis that mixing techniques affect the strength of the cement. Use $\alpha=0.05$.
Construct a graphical display as described in Section 3-5.3 to compare the mean tensile strengths for the four mixing techniques. What are your conclusions?
Use the Fisher LSD method with $\alpha=0.05$ to make comparisons between pairs of means.
Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?
Plot the residuals versus the predicted tensile strength. Comment on the plot.
Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.

Rework part (b) of Problem 3-1 using Duncan’s multiple range test with $\alpha=0.05$. Does this make any difference in your conclusions?
Rework part (b) of Problem 3-1 using Tukey’s test with $\alpha=0.05$. Do you get the same conclusions from Tukey’s test that you did from the graphical procedure and/or Duncan’s multiple range test?

COMPUTATIONAL AND GRAPHICAL SECTION

The tensile strength of Portland cement is being studied. Four different mixing techniques can be used economically. The following data have been collected:

Mixing Techniques	Tensile Strength (lb/in²)				Totals ($y_{i.}$)	Averages ($\bar{y}_{i.}$)
1	3129	3000	2865	2890	11884	2971.00
2	3200	3300	2975	3150	12625	3156.25
3	2800	2900	2985	3050	11735	2933.75
4	2600	2700	2600	2765	10665	2666.25
					$y_{..}=46909$	$\bar{y}_{..}=2931.81$

Test the hypothesis that mixing techniques affect the strength of the cement. Use $\alpha=0.05$.

I. Hypotheses:

H₀: $\mu_{1}=\mu_{2}=\mu_{3}=\mu_{4}$

H₁: some means are different.

II. Level of significance: $\alpha = 0.05$

III. Test Statistic: $$F_{0}=\frac{\frac{SS_{Treatments}}{a-1}}{\frac{SS_{E}}{N-a}}=\frac{MS_{Treatments}}{MS_{E}}$$

IV. Rejection Region:
$$F_{0}>F_{\alpha,a-1,N-a}\\F_{0}>F_{0.05,3,12}\\F_{0}>3.49$$

V. Computation:
$$SS_{T}=\sum_{i=1}^{4}\sum_{j=1}^{4}y_{ij}^2-\frac{y_{..}^2}{N}\\=(3129)^2+(3000)^2+\dots+(2600)^2+(2765)^2-\frac{(46909)^2}{16}\\=138172041-\frac{(46909)^2}{16}=643648.4375\\SS_{Treatments}=\frac{1}{n}\sum_{i=1}^{4}y_{i.}^2-\frac{y_{..}^2}{N}\\=\frac{1}{4}[(11884)^2+\dots+(10665)^2]-\frac{(46909)^2}{16}=489740.1875\\SS_{E}=SS_{T}-SS_{Treatments}\\=643648.4375-489740.1875=153908.25$$

ANOVA Table
Source	Sum of Squares	Degrees of Freedom	Mean Square	F₀	P-Value
Model	489740.19	3	163246.73	12.73	0.0005
Error	153908.25	12	12825.69
Total	643648.44	15

The F-value of 12.73 exceeds the tabulated value of 3.49, and the p-value (0.0005) is less than the level of significance, so the model is significant. We therefore reject the null hypothesis and conclude that the mixing techniques significantly affect the strength of the cement.
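The sums of squares and the F ratio can be checked with a few lines of code. The following is a minimal Python sketch rather than the original R code of this post; the variable names are illustrative, and the critical value 3.49 is simply the tabulated figure quoted in the rejection region.

```python
# Tensile strength (lb/in^2) for the four mixing techniques, from the data table.
data = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}

a = len(data)        # number of treatments
n = len(data[1])     # replicates per treatment (balanced design)
N = a * n            # total number of observations

grand_total = sum(sum(ys) for ys in data.values())
correction = grand_total ** 2 / N          # y..^2 / N

ss_total = sum(y ** 2 for ys in data.values() for y in ys) - correction
ss_treat = sum(sum(ys) ** 2 for ys in data.values()) / n - correction
ss_error = ss_total - ss_treat

ms_treat = ss_treat / (a - 1)
ms_error = ss_error / (N - a)
f0 = ms_treat / ms_error

F_CRIT = 3.49        # tabulated F(0.05; 3, 12), taken from the text

print(f"SS_T = {ss_total:.4f}, SS_Treatments = {ss_treat:.4f}, SS_E = {ss_error:.4f}")
print(f"MS_Treatments = {ms_treat:.2f}, MS_E = {ms_error:.2f}, F0 = {f0:.2f}")
print("reject H0" if f0 > F_CRIT else "fail to reject H0")
```

The p-value is left out on purpose, since the standard library has no F distribution; a statistics package would be needed for that step.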

Construct a graphical display as described in Section 3-5.3 to compare the mean tensile strengths for the four mixing techniques. What are your conclusions?

Dashed lines in the plot, by color:

Red – $\bar{y}_{4}$ Mean of Treatment 4 (2666.25)

Pink – $\bar{y}_{..}$ Grand Mean (2931.81)

Brown – $\bar{y}_{3}$ Mean of Treatment 3 (2933.75)

Green – $\bar{y}_{1}$ Mean of Treatment 1 (2971.00)

Blue – $\bar{y}_{2}$ Mean of Treatment 2 (3156.25)

Based on the plot and the data, we conclude that $\bar{y}_{1}$ and $\bar{y}_{3}$ are the same (see also the scatter plot for the sixth part of question 1). Moreover, $\bar{y}_{4}$ differs from $\bar{y}_{1}$ and $\bar{y}_{3}$, $\bar{y}_{2}$ differs from $\bar{y}_{1}$ and $\bar{y}_{3}$, and $\bar{y}_{2}$ and $\bar{y}_{4}$ are different.

How did I do it?

First, we draw a Student's t distribution with the error degrees of freedom, $N-a=12$. Next, we locate the four treatment means along the x-axis of that plot. Because the means are far too large to appear on the scale of the t distribution, we first convert them to t-values using the following formula, in which $\sigma$ is estimated by $\sqrt{MS_{E}}$:$$t=\frac{\bar{y}_{i.}-\bar{y}_{..}}{\frac{\sigma}{\sqrt{n}}}$$
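The conversion can be sketched as follows. This is a small Python illustration, assuming that $\sigma$ is replaced by its estimate $\sqrt{MS_{E}}$ from the ANOVA table; the names `means`, `scale` and `t_values` are illustrative only.

```python
import math

# Treatment averages, grand mean and MS_E as computed in the ANOVA above.
means = {1: 2971.00, 2: 3156.25, 3: 2933.75, 4: 2666.25}
grand_mean = 2931.8125
ms_error = 12825.6875
n = 4

# Assumption: sigma is estimated by sqrt(MS_E), so the scale factor
# for a treatment mean is sqrt(MS_E / n).
scale = math.sqrt(ms_error / n)

t_values = {i: (m - grand_mean) / scale for i, m in means.items()}

# List the means from left to right as they would sit on the t axis.
for i, t in sorted(t_values.items(), key=lambda kv: kv[1]):
    print(f"treatment {i}: mean = {means[i]:.2f}, t = {t:+.3f}")
```

On this scale Treatments 1 and 3 sit close together near zero, while Treatments 2 and 4 fall far out in opposite tails, which is the pattern the plot is meant to show.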

Use the Fisher LSD method with $\alpha = 0.05$ to make comparisons between pairs of means.

$$LSD=t_{\frac{\alpha}{2},N-a}\sqrt{\frac{2MS_{E}}{n}}=t_{0.025,16-4}\sqrt{\frac{2(12825.7)}{4}}=2.179\sqrt{6412.85}=174.495$$

Thus, any pair of treatment averages that differ in absolute value by more than 174.495 would imply that the corresponding pair of population means are significantly different.

The absolute differences in averages are$$|\bar{y}_{1.}-\bar{y}_{2.}|=|2971.00-3156.25|=185.25>174.495*\\|\bar{y}_{1.}-\bar{y}_{3.}|=|2971.00-2933.75|=37.25<174.495\\|\bar{y}_{1.}-\bar{y}_{4.}|=|2971.00-2666.25|=304.75>174.495*\\|\bar{y}_{2.}-\bar{y}_{3.}|=|3156.25-2933.75|=222.50>174.495*\\|\bar{y}_{2.}-\bar{y}_{4.}|=|3156.25-2666.25|=490.00>174.495*\\|\bar{y}_{3.}-\bar{y}_{4.}|=|2933.75-2666.25|=267.50>174.495*$$

The starred values indicate pairs of means that are significantly different.
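The six comparisons are easy to get wrong by hand, so here is a short Python sketch that repeats them. It is an illustration only: the t quantile 2.179 is the tabulated value used above, not something the code computes, and the dictionary names are made up for this example.

```python
import math
from itertools import combinations

means = {1: 2971.00, 2: 3156.25, 3: 2933.75, 4: 2666.25}
ms_error = 12825.6875   # MS_E from the ANOVA table
n = 4
T_CRIT = 2.179          # tabulated t(0.025; 12), taken from the text

lsd = T_CRIT * math.sqrt(2 * ms_error / n)

significant = {}
for i, j in combinations(sorted(means), 2):
    diff = abs(means[i] - means[j])
    significant[(i, j)] = diff > lsd
    flag = "*" if significant[(i, j)] else ""
    print(f"{i} vs. {j}: |diff| = {diff:7.2f}  (LSD = {lsd:.3f}) {flag}")
```

Working with absolute differences means the sign of each subtraction no longer matters when comparing against the LSD.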

Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?

Nothing is unusual in the plot. The points stay within the 95 percent confidence band, so the residuals appear consistent with the normality assumption.
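The coordinates behind a normal probability plot can be produced without any plotting library. The sketch below is a Python illustration under one stated assumption: the plotting positions $(k-0.5)/N$ are a common convention, and the original plot may have used a different one. The correlation between the normal scores and the ordered residuals is added as a rough numerical stand-in for "the points fall near a straight line".

```python
from statistics import NormalDist

data = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}

# Residuals e_ij = y_ij - ybar_i. (observation minus its treatment average).
residuals = []
for ys in data.values():
    mean_i = sum(ys) / len(ys)
    residuals.extend(y - mean_i for y in ys)

residuals.sort()
N = len(residuals)

# Assumed plotting positions (k - 0.5) / N, mapped to standard normal quantiles.
quantiles = [NormalDist().inv_cdf((k - 0.5) / N) for k in range(1, N + 1)]


def correlation(xs, ys):
    """Pearson correlation, written out by hand to keep the sketch self-contained."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


r = correlation(quantiles, residuals)

for q, e in zip(quantiles, residuals):
    print(f"{q:+.3f}  {e:+9.2f}")
print(f"correlation between normal scores and residuals: {r:.3f}")
```

A correlation close to 1 agrees with the visual impression of a roughly straight line, although it is not a formal test of normality.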

Plot the residuals versus the predicted tensile strength. Comment on the plot.

The plot shows a slight outward-opening funnel (megaphone) shape. The pattern is not pronounced, but it suggests that the error variance may not be constant.
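One way to put a number on the funnel shape is to compare the spread of the residuals within each treatment, ordered by predicted value. The following Python sketch does this with the sample standard deviation; it is a rough check only, since each group has just four observations, and the variable names are illustrative.

```python
from statistics import stdev

data = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}

# In a one-way layout the predicted value of every run is its treatment average.
rows = []
for technique, ys in data.items():
    fitted = sum(ys) / len(ys)
    resid = [y - fitted for y in ys]
    rows.append((fitted, technique, stdev(resid)))

rows.sort()  # order by predicted tensile strength, smallest first
for fitted, technique, sd in rows:
    print(f"technique {technique}: predicted = {fitted:8.2f}, residual sd = {sd:7.2f}")

spreads = [sd for _, _, sd in rows]
print("spread increases with predicted value:", spreads == sorted(spreads))
```

For these data the residual standard deviation grows as the predicted strength grows, which is consistent with the mild funnel seen in the plot.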

Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.

2. Rework part (b) of Problem 3-1 using Duncan’s multiple range test with $\alpha=0.05$. Does this make any difference in your conclusions?

Ranking the treatment averages in ascending order, we have$$\bar{y}_{4.}=2666.25\\\bar{y}_{3.}=2933.75\\\bar{y}_{1.}=2971.00\\\bar{y}_{2.}=3156.25$$
The standard error of each average is $S_{\bar{y}_{i.}}=\sqrt{\frac{12825.69}{4}}=56.625$. From the table of significant ranges for 12 degrees of freedom and $\alpha=0.05$, we obtain $r_{0.05}(2,12)=3.081$, $r_{0.05}(3,12)=3.225$, and $r_{0.05}(4,12)=3.312$. Thus, the least significant ranges are$$R_{2}=r_{0.05}(2,12)S_{\bar{y}_{i.}}=(3.081)(56.625)=174.46\\R_{3}=r_{0.05}(3,12)S_{\bar{y}_{i.}}=(3.225)(56.625)=182.62\\R_{4}=r_{0.05}(4,12)S_{\bar{y}_{i.}}=(3.312)(56.625)=187.54$$

The comparisons would yield$$2\text{ vs. }4: 3156.25-2666.25=490.00>187.54\ (R_{4})\\2\text{ vs. }3: 3156.25-2933.75=222.50>182.62\ (R_{3})\\2\text{ vs. }1: 3156.25-2971.00=185.25>174.46\ (R_{2})\\1\text{ vs. }4: 2971.00-2666.25=304.75>182.62\ (R_{3})\\1\text{ vs. }3: 2971.00-2933.75=37.25<174.46\ (R_{2})\\3\text{ vs. }4: 2933.75-2666.25=267.50>174.46\ (R_{2})$$

From the analysis we observe significant differences between all pairs of means except 1 and 3. This matches the earlier conclusion from the LSD method, so Duncan’s multiple range test and the LSD method lead to identical conclusions for these data.
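The range comparisons can be reproduced with the Python sketch below. It is a simplified illustration: the significant ranges are the tabulated values quoted above, and the code compares every pair directly without the additional rule that stops testing inside a group of means already found not to differ. That rule does not change the outcome for these data.

```python
import math

means = {1: 2971.00, 2: 3156.25, 3: 2933.75, 4: 2666.25}
ms_error = 12825.6875   # MS_E from the ANOVA table
n = 4

# Tabulated significant ranges r_0.05(p, 12), taken from the text.
R_TABLE = {2: 3.081, 3: 3.225, 4: 3.312}

se = math.sqrt(ms_error / n)                       # standard error of an average
least_sig_range = {p: r * se for p, r in R_TABLE.items()}

ranked = sorted(means, key=means.get)              # treatments in ascending order of mean

significant = {}
# Largest versus smallest first, then work inward, as in the text.
for hi in range(len(ranked) - 1, 0, -1):
    for lo in range(hi):
        p = hi - lo + 1                            # number of ranked means spanned
        diff = means[ranked[hi]] - means[ranked[lo]]
        pair = tuple(sorted((ranked[lo], ranked[hi])))
        significant[pair] = diff > least_sig_range[p]
        sign = ">" if significant[pair] else "<"
        print(f"{ranked[hi]} vs. {ranked[lo]}: {diff:7.2f} {sign} {least_sig_range[p]:.2f} (R_{p})")
```

The key point is that the yardstick $R_{p}$ depends on how many ranked means the pair spans, unlike the single LSD value.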

3. Rework part (b) of Problem 3-1 using Tukey’s test with $\alpha=0.05$. Do you get the same conclusions from Tukey’s test that you did from the graphical procedure and/or Duncan’s multiple range test?

$$T_{0.05}=q_{0.05}(4,12)\sqrt{\frac{MS_{E}}{n}}=4.20\sqrt{\frac{12825.69}{4}}=4.20(56.625)=237.825$$

Thus, any pair of treatment averages that differ in absolute value by more than 237.825 would imply that the corresponding pair of population means are significantly different. The four treatment averages are$$\bar{y}_{1.}=2971.00~~~~~\bar{y}_{2.}=3156.25~~~~~\bar{y}_{3.}=2933.75~~~~~\bar{y}_{4.}=2666.25$$ and the absolute differences in averages are$$|\bar{y}_{1.}-\bar{y}_{2.}|=|2971.00-3156.25|=185.25\\|\bar{y}_{1.}-\bar{y}_{3.}|=|2971.00-2933.75|=37.25\\|\bar{y}_{1.}-\bar{y}_{4.}|=|2971.00-2666.25|=304.75*\\|\bar{y}_{2.}-\bar{y}_{3.}|=|3156.25-2933.75|=222.50\\|\bar{y}_{2.}-\bar{y}_{4.}|=|3156.25-2666.25|=490.00*\\|\bar{y}_{3.}-\bar{y}_{4.}|=|2933.75-2666.25|=267.50*$$ The starred values indicate pairs of means that are significantly different.

The conclusions are not the same. Under Tukey’s test the mean of Treatment 4 still differs from the means of Treatments 1, 2, and 3, just as it did under Duncan’s multiple range test. However, Tukey’s test does not find the mean of Treatment 2 to be different from the means of Treatments 1 and 3, whereas those pairs were found to be different using the graphical method, the Fisher LSD method, and Duncan’s multiple range test. The means of Treatments 1 and 3 are not significantly different under any of the methods.
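To see exactly which pairs change status, the Tukey and LSD yardsticks can be applied side by side. This Python sketch is an illustration only; the studentized range value 4.20 and the t value 2.179 are the tabulated figures used earlier, and the names are made up for this example.

```python
import math
from itertools import combinations

means = {1: 2971.00, 2: 3156.25, 3: 2933.75, 4: 2666.25}
ms_error = 12825.6875   # MS_E from the ANOVA table
n = 4

Q_CRIT = 4.20           # tabulated q(0.05; 4, 12), taken from the text
T_CRIT = 2.179          # tabulated t(0.025; 12), taken from the text

tukey = Q_CRIT * math.sqrt(ms_error / n)
lsd = T_CRIT * math.sqrt(2 * ms_error / n)

tukey_sig, lsd_sig = {}, {}
for i, j in combinations(sorted(means), 2):
    diff = abs(means[i] - means[j])
    tukey_sig[(i, j)] = diff > tukey
    lsd_sig[(i, j)] = diff > lsd
    print(f"{i} vs. {j}: |diff| = {diff:7.2f}  Tukey: {tukey_sig[(i, j)]}  LSD: {lsd_sig[(i, j)]}")

# Pairs flagged by the LSD method but not by Tukey's test.
changed = [pair for pair in tukey_sig if lsd_sig[pair] and not tukey_sig[pair]]
print("pairs that lose significance under Tukey:", changed)
```

The larger Tukey yardstick reflects its control of the experimentwise error rate, which is why the two differences involving Treatment 2 near the threshold no longer qualify.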

Reference:
Design and Analysis of Experiments by Douglas C. Montgomery

R CODES SECTION
