Problems 3-1 through 3-3, Analysis of variance
==Solution== [[Image:3-1dataplot.png|thumb|left|'''Figure 1:''' Illustration of data.]] In these problems, we are given a data set with <math>a=4</math> subsets, each containing <math>n=4</math> values for a total of <math>N=a n=16</math> data points. Each subset contains measurements of the tensile strength of cement samples that were produced with a different mixing technique, or '''treatment'''. We define the mean over all data points as the '''grand mean''', and the mean of the points within a given treatment as the '''treatment mean'''. To compare these data subsets, it is useful to think of each data point as the sum of the grand mean <math>\mu</math>, the ith treatment effect <math>\tau_i</math> (the deviation of the ith treatment mean from the grand mean), and a random error <math>\epsilon_{ij}</math> specific to the jth data point in the ith treatment (refer to figure 1). <center><math>y_{ij}=\mu+\tau_i+\epsilon_{ij}\begin{cases}i=1,2,\ldots,a\\j=1,2,\ldots,n\end{cases}</math></center> Since we are given a finite set of data, we must approximate these means by calculating sample means. The grand sample mean is given by: <center><math>\bar{y}..=\frac{1}{N}\sum_{i=1}^a \sum_{j=1}^n y_{ij} = 2932</math></center> The sample treatment means are given by: <center><math>\bar{y}_i.=\frac{1}{n}\sum_{j=1}^n y_{ij}</math> where <math>i=1, 2, \ldots, a</math></center> A dot in the subscript indicates summation over the index it replaces (with the bar, the corresponding average). ===Section 3-1 (A): Hypothesis testing=== [[Image:3-1error_comparison.png|thumb|360px|left|'''Figure 2:''' Comparison of hypothetical data sets for which most of the error is between the treatment means (left) and within the treatment means (right).]] We would like to know whether one of our data subsets is significantly different from the others, as this may indicate that one of our manufacturing techniques is superior (or inferior) to the others. To compare our data subsets, we are interested in whether most of the error is within the treatments (<math>\epsilon</math>) or between the treatments (<math>\tau</math>). 
If most of the error is between the treatment means, then we can claim there are significant differences between them. If there is too much error within the treatment means, we cannot claim that they are significantly different (see figure 2). Mathematically, we can approximate the error between the treatment means as <center><math>\mathrm{MS_{Treatments}}=\frac{n \sum_{i=1}^a (\bar{y}_i.-\bar{y}..)^2}{a-1}=163,247</math></center> To approximate the error within the treatment means, it is easiest to subtract the error between the means from the total error: <center><math>\mathrm{MS_{Error}}=\frac{\sum_{i=1}^a \sum_{j=1}^n (y_{ij}-\bar{y}..)^2-n \sum_{i=1}^a (\bar{y}_i.-\bar{y}..)^2}{N-a}</math><math>=\frac{\sum_{i=1}^a \sum_{j=1}^n y_{ij}^2-\frac{1}{n}\sum_{i=1}^{a} y_i.^2}{N-a}=12,826</math></center> We are interested in the ratio <center><math>F_0=\frac{\mathrm{MS_{Treatments}}}{\mathrm{MS_{Error}}}=12.7</math></center> To determine whether or not there are significant differences between our treatments, we will compare F<sub>0</sub> to <math>F_{\alpha,\,a-1,\,N-a}=3.5</math> from the F distribution. In Excel this value can be found using the function <tt>FINV(α,a−1,N−a)</tt>; in R it can be found using <tt>qf(1−α,a−1,N−a)</tt>. If <math>F_0>F_{\alpha,\,a-1,\,N-a}</math>, which is the case here, then the error between treatment means is large enough compared to the error within treatment means to conclude that there is a significant difference between at least one treatment and the others. 
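Numerically, the test can be reproduced from the sums of squares alone. The sketch below is in Python (the page's own examples use Excel and R); the critical value <math>F_{0.05,\,3,\,12}=3.49</math> is hard-coded, since it would otherwise come from an F-table or a function such as <tt>qf</tt>:

```python
# One-way ANOVA F-test from the sums of squares in the table below:
# SS_treatments = 489740 (df = a-1 = 3), SS_error = 153908 (df = N-a = 12).
ss_treat, df_treat = 489740, 3
ss_error, df_error = 153908, 12

ms_treat = ss_treat / df_treat       # error between treatment means
ms_error = ss_error / df_error       # error within treatments
f0 = ms_treat / ms_error             # ~12.728

f_crit = 3.49                        # F_{0.05, 3, 12}, assumed from an F-table
if f0 > f_crit:
    print("Reject H0: at least one treatment mean differs")
else:
    print("Fail to reject H0")
```
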
<table cellspacing=0 cellpadding=5 style="border-top: 1px solid black; border-bottom: 1px solid black"> <tr align="center"> <th bgcolor="#eeeeee" style="border-bottom: 1px solid black;"> Source of Variation </th> <th bgcolor="#eeeeee" style="border-bottom: 1px solid black;"> Sum of Squares </th> <th bgcolor="#eeeeee" style="border-bottom: 1px solid black;"> Degrees of Freedom </th> <th bgcolor="#eeeeee" style="border-bottom: 1px solid black;"> Mean Square </th> <th bgcolor="#eeeeee" style="border-bottom: 1px solid black;"> <math>F_0</math> </th> <th bgcolor="#eeeeee" style="border-bottom: 1px solid black;"> <math>F_{\alpha}</math> </th> </tr> <tr align="center"> <th bgcolor="#eeeeee"> Mixing Technique </th> <td> 489740 </td> <td> 3 </td> <td> 163247 </td> <td> 12.728 </td> <td> 3.490 </td> </tr> <tr align="center"> <th bgcolor="#eeeeee"> Error </th> <td> 153908 </td> <td> 12 </td> <td> 12826 </td> <td></td> <td></td> </tr> <tr align="center"> <th bgcolor="#eeeeee" style="border-top: 1px solid black;"> Total </th> <td style="border-top: 1px solid black;"> 643648 </td> <td style="border-top: 1px solid black;"> 15 </td> <td style="border-top: 1px solid black;"></td> <td style="border-top: 1px solid black;"></td> <td style="border-top: 1px solid black;"></td> </tr> </table> ===Section 3-1 (B): Graphical display to compare mean tensile strengths=== [[Image:3-1b.png|thumb|left|Comparison of data to a T distribution.]] A relatively simple way to visualize the treatment means, and to judge qualitatively whether they are statistically equal, is to plot the four averages on the same graph as a T distribution. We need to know what mean and standard deviation to use for our T distribution. For the mean, we will simply use the grand mean, and we will approximate the standard deviation with <math>\sqrt{\mathrm{MS_{Error}}/n}=\sqrt{12826/4}=56.6</math>. 
This approximation for the standard deviation relies on <math>\mathrm{MS_{Error}}</math>, which does not take into account the differences between the treatment means. It assumes that the treatment means are all equal – if they are not statistically equivalent, it will be obvious when we plot the treatment means on the same plot as the distribution. Looking at the plot on the left, we see that for our data it is unlikely that all of the treatment means come from the plotted distribution. The two treatment means under the tails of the T distribution appear to be significantly different from those under the center. ===Section 3-1 (C): Fisher LSD comparisons=== Fisher LSD comparisons allow each pair of treatment means to be compared. This is done using a t-test as we did in problem 2.11 B (solution: [http://www.jlab.org/~pcarter/stats/2-11.xlsx Excel], [http://www.jlab.org/~pcarter/stats/2-11.R R]), but replacing <math>S_p</math> with <math>\sqrt{\mathrm{MS_E}}</math>: <center><math>t_0=\frac{\bar{y}_i.-\bar{y}_j.}{\sqrt{\frac{2\mathrm{MS_E}}{n}}}</math></center> Solving for <math>\bar{y}_{i.} - \bar{y}_{j.}</math> yields: <center><math>\bar{y}_i.-\bar{y}_j.=t_0\sqrt{\frac{2\mathrm{MS_E}}{n}}</math></center> We will compare this to a theoretical value called the least significant difference: <center><math>\mathrm{LSD}=t_{\alpha/2,~N-a} \sqrt{\frac{2\mathrm{MS_E}}{n}}=174.5</math></center> If <math>|\bar{y}_i-\bar{y}_j| > \mathrm{LSD}</math> then the treatment means <math>i</math> and <math>j</math> are significantly different. In Excel, you can calculate LSD using <tt>TINV(α,N-a)*sqrt(2*MSe/n)</tt>, where <tt>MSe</tt> is <math>\mathrm{MS_{Error}}</math>. In R, the equivalent command is <tt>qt(1-α/2,N-a)*sqrt(2*MSe/n)</tt>. The following table shows the differences between each pair of treatment means. Differences highlighted in <span style="background-color: #ddddff">blue</span> are large enough for that pair to be considered significantly different. 
<table border=1 cellspacing=0 cellpadding=4> <tr><th colspan=4 align=center><math>|\bar{y}_i-\bar{y}_j|</math></th></tr> <tr><th bgcolor="#eeeeee"></th><th bgcolor="#eeeeee">2</th><th bgcolor="#eeeeee">3</th><th bgcolor="#eeeeee">4</th></tr> <tr><th bgcolor="#eeeeee">1</th><td align='right' bgcolor="#ddddff">185.25</td><td align='right'>37.25</td><td align='right' bgcolor="#ddddff">304.75</td></tr> <tr><th bgcolor="#eeeeee">2</th><td align='right'></td><td align='right' bgcolor="#ddddff">222.50</td><td align='right' bgcolor="#ddddff">490.00</td></tr> <tr><th bgcolor="#eeeeee">3</th><td align='right'></td><td align='right'></td><td align='right' bgcolor="#ddddff">267.50</td></tr> </table> To apply the Fisher LSD method to our data, we compare LSD to each of the numbers in the table above. For example, we see that 185.25 > 174.5, so there is a statistically significant difference between treatments 1 and 2. ===Section 3-1 (D): Normal probability plot=== [[Image:npp.png|thumb|left|Normal probability plot.]] We have been assuming that our data is normally distributed (Gaussian), and that it is therefore valid to do t-tests. To be sure, we should check our normality assumption by creating a normal probability plot. This is done by plotting the residuals against values from a z-distribution. Residuals are calculated by subtracting the corresponding treatment mean from each data point, and must be sorted before using them to make the plot. The values we seek from a z-distribution are obtained by doing <tt>NORMSINV(percent)</tt> in Excel, or <tt>qnorm(percent)</tt> in R. In these commands, <tt>percent</tt> is a number from 1/(dof+1) to dof/(dof+1), where <tt>dof</tt> is the degrees of freedom. These commands return z-distribution values that represent ideal residual values. If the resulting plot is roughly linear, then the normality assumption is valid. <br clear="all" /> ===Section 3-1 (E): Plot of residuals vs. predicted tensile strength=== [[Image:3-1e.png|thumb|left|Residuals vs. predicted tensile strength.]] As an estimate of the tensile strength for each treatment, we use the treatment mean. The plot of residuals vs. their treatment means gives an indication of the relative sizes of errors between (x-axis) and within (y-axis) treatments. <br clear="all" /> ===Section 3-1 (F): Plot of all data=== [[Image:3-1dataplot.png|Plot of all data.]] ===Section 3-2 (A): Tukey test=== Tukey's test is similar to Fisher LSD comparisons in that it allows pairs of treatment means to be compared. However, instead of using the t-statistic, Tukey's test uses the Studentized range statistic q: <center><math>q=\frac{\bar{y}_{max}-\bar{y}_{min}}{\sqrt{\mathrm{MS_{Error}}/n}}</math></center> Solving for <math>\bar{y}_{max}-\bar{y}_{min}</math> yields: <center><math>\bar{y}_{max}-\bar{y}_{min} = q\sqrt{\mathrm{MS_{Error}}/n}</math></center> We will compare this with the theoretical value: <center><math>T_\alpha = q_\alpha(a,~f) \sqrt{\mathrm{MS_{Error}}/n}=4.2 \sqrt{12826/4}=237.75</math></center> where <math>f=N-a</math> is the error degrees of freedom. This can be calculated in R using <tt>qtukey(1-alpha,a,N-a)*sqrt(MSe/n)</tt>, where <tt>MSe</tt> is <math>\mathrm{MS_{Error}}</math>. If <math>|\bar{y}_i - \bar{y}_j| > T_\alpha</math>, there is a significant difference between the two treatments. We now compare this statistic to the differences between the treatment means. Differences highlighted in <span style="background-color: #ddddff">blue</span> are large enough for that pair to be considered significantly different. 
<table border=1 cellspacing=0 cellpadding=4> <tr><th colspan=4 align=center><math>|\bar{y}_i-\bar{y}_j|</math></th></tr> <tr><th bgcolor="#eeeeee"></th><th bgcolor="#eeeeee">2</th><th bgcolor="#eeeeee">3</th><th bgcolor="#eeeeee">4</th></tr> <tr><th bgcolor="#eeeeee">1</th><td align='right'>185.25</td><td align='right'>37.25</td><td align='right' bgcolor="#ddddff">304.75</td></tr> <tr><th bgcolor="#eeeeee">2</th><td align='right'></td><td align='right'>222.50</td><td align='right' bgcolor="#ddddff">490.00</td></tr> <tr><th bgcolor="#eeeeee">3</th><td align='right'></td><td align='right'></td><td align='right' bgcolor="#ddddff">267.50</td></tr> </table> ===Section 3-2 (B): Difference between Tukey and Fisher procedures=== The Fisher procedure uses the t-statistic to compare pairs of treatment means, while the Tukey test uses the Studentized range statistic. One consequence of this is that the Fisher procedure controls the error rate <math>\alpha</math> for each individual pairwise comparison, whereas the Tukey test controls the experimentwise error rate across all pairwise comparisons. ===Section 3-3: Confidence intervals=== <div style="float:left; vertical-align: top; padding-right: 20px; padding-bottom: 20px;"> [[Image:3-1fake3.png|thumb|none|'''Figure 7:''' Confidence interval on the mean tensile strength for each mixing technique.]] <br> [[Image:3-1fake4.png|thumb|none|'''Figure 8:''' Confidence interval on the differences in means.]]</div> We want to find a 95% confidence interval on the mean tensile strength for each mixing technique. Each interval is centered on the treatment mean: the upper bound is <math>\bar{y}_i.+t_{\alpha/2,~N-a}\sqrt{\mathrm{MS_E}/n}</math> and the lower bound is <math>\bar{y}_i.-t_{\alpha/2,~N-a}\sqrt{\mathrm{MS_E}/n}</math>. With <math>\alpha</math>=0.05, this gives a 95% confidence interval (see figure 7). 
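The interval computation can be sketched in Python. The half-width used is <math>t_{\alpha/2,\,N-a}\sqrt{\mathrm{MS_E}/n}\approx 123.4</math>, which reproduces the tabulated bounds; the critical value 2.179 for <math>t_{0.025,\,12}</math> is assumed from a t-table, and the unrounded treatment means are reconstructed from the rounded means and the exact pairwise differences reported in the tables:

```python
from math import sqrt

# 95% CI on each treatment mean: ybar_i +/- t_{alpha/2, N-a} * sqrt(MSe / n)
ms_error, n = 12826, 4
t_crit = 2.179                        # t_{0.025, 12}, assumed from a t-table
half_width = t_crit * sqrt(ms_error / n)   # ~123.4

# Treatment means reconstructed from the rounded means and the exact
# pairwise differences in the tables (assumed values).
treatment_means = {1: 2971.00, 2: 3156.25, 3: 2933.75, 4: 2666.25}

for i, m in sorted(treatment_means.items()):
    print(f"Treatment {i}: [{m - half_width:.0f}, {m + half_width:.0f}]")
```
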
<center><math>t_{\alpha/2,~N-a} \sqrt{\frac{\mathrm{MS_E}}{n}}=123.4</math></center> <table border=1 cellspacing=0 cellpadding=4> <tr><th bgcolor="#eeeeee"></th><th bgcolor="#eeeeee">lower bound</th><th bgcolor="#eeeeee">treatment mean</th><th bgcolor="#eeeeee">upper bound</th></tr> <tr><th bgcolor="#eeeeee">Treatment 1</th><td align='right'>2848</td><td align='right'>2971</td><td align='right'>3094</td></tr> <tr><th bgcolor="#eeeeee">Treatment 2</th><td align='right'>3033</td><td align='right'>3156</td><td align='right'>3280</td></tr> <tr><th bgcolor="#eeeeee">Treatment 3</th><td align='right'>2810</td><td align='right'>2933</td><td align='right'>3057</td></tr> <tr><th bgcolor="#eeeeee">Treatment 4</th><td align='right'>2543</td><td align='right'>2666</td><td align='right'>2790</td></tr> </table> To find the confidence interval on the differences in means, we subtract to get the difference between each pair of treatment means, and then add and subtract the least significant difference <math>\mathrm{LSD}=t_{\alpha/2,~N-a} \sqrt{\frac{2\mathrm{MS_E}}{n}}=174.5</math> to obtain the bounds (see figure 8). 
<table border=1 cellspacing=0 cellpadding=4> <tr><th bgcolor="#eeeeee"></th><th bgcolor="#eeeeee">lower bound</th><th bgcolor="#eeeeee"><math>\bar{y}_{i.} - \bar{y}_{j.}</math></th><th bgcolor="#eeeeee">upper bound</th></tr> <tr><th bgcolor="#eeeeee">Treatment 1 - 2</th><td align='right'>-359</td><td align='right'>-185</td><td align='right'>-10</td></tr> <tr><th bgcolor="#eeeeee">Treatment 1 - 3</th><td align='right'>-137</td><td align='right'>37</td><td align='right'>211</td></tr> <tr><th bgcolor="#eeeeee">Treatment 1 - 4</th><td align='right'>130</td><td align='right'>304</td><td align='right'>479</td></tr> <tr><th bgcolor="#eeeeee">Treatment 2 - 3</th><td align='right'>48</td><td align='right'>222</td><td align='right'>396</td></tr> <tr><th bgcolor="#eeeeee">Treatment 2 - 4</th><td align='right'>315</td><td align='right'>490</td><td align='right'>664</td></tr> <tr><th bgcolor="#eeeeee">Treatment 3 - 4</th><td align='right'>93</td><td align='right'>267</td><td align='right'>441</td></tr> </table>
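As with the individual intervals, this can be sketched in Python; a pair of treatments differs significantly exactly when its interval excludes zero. The t critical value 2.179 for <math>t_{0.025,\,12}</math> is again assumed from a t-table:

```python
from math import sqrt

# 95% CI on each difference of treatment means: diff +/- LSD,
# where LSD = t_{0.025, 12} * sqrt(2 * MSe / n).
ms_error, n = 12826, 4
t_crit = 2.179                        # t_{0.025, 12}, assumed from a t-table
lsd = t_crit * sqrt(2 * ms_error / n)      # ~174.5

# Pairwise differences ybar_i - ybar_j from the table above.
diffs = {(1, 2): -185.25, (1, 3): 37.25, (1, 4): 304.75,
         (2, 3): 222.50, (2, 4): 490.00, (3, 4): 267.50}

for (i, j), d in sorted(diffs.items()):
    lo, hi = d - lsd, d + lsd
    verdict = "significant" if (lo > 0 or hi < 0) else "not significant"
    print(f"{i}-{j}: [{lo:.0f}, {hi:.0f}] {verdict}")
```
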