# Introduction

The following are facts about the *F* distribution:

- The curve is not symmetrical but skewed to the right.
- There is a different curve for each set of *df*s.
- The *F* statistic is greater than or equal to zero.
- As the degrees of freedom for the numerator and for the denominator get larger, the curve approximates the normal.
- Other uses for the *F* distribution include comparing two variances and two-way analysis of variance. Two-way analysis is beyond the scope of this chapter.
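The right skew and the dependence on the degrees of freedom can be seen numerically. The sketch below relies on two standard properties of the *F* distribution that are not derived in this chapter, so treat them as assumptions here: the mean is $d_2/(d_2-2)$ for $d_2 > 2$, and the mode (the peak of the curve) is $\frac{d_1-2}{d_1}\cdot\frac{d_2}{d_2+2}$ for $d_1 > 2$, where $d_1$ and $d_2$ are the numerator and denominator degrees of freedom. The function names are illustrative.

```python
def f_mean(d1, d2):
    """Mean of the F distribution; defined only when d2 > 2."""
    if d2 <= 2:
        raise ValueError("mean requires d2 > 2")
    return d2 / (d2 - 2)


def f_mode(d1, d2):
    """Mode (peak of the curve) of the F distribution; this formula needs d1 > 2."""
    if d1 <= 2:
        raise ValueError("this mode formula requires d1 > 2")
    return (d1 - 2) / d1 * d2 / (d2 + 2)


# A few (numerator df, denominator df) pairs, from small to large.
for d1, d2 in [(4, 10), (10, 30), (100, 100)]:
    gap = f_mean(d1, d2) - f_mode(d1, d2)
    print(f"df = ({d1}, {d2}): mode = {f_mode(d1, d2):.3f}, "
          f"mean = {f_mean(d1, d2):.3f}, gap = {gap:.3f}")
```

In each case the peak sits to the left of the mean, which is what a right-skewed curve looks like, and the gap narrows as both degrees of freedom grow.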

### Example 13.2

Let’s return to the slicing tomato exercise in Try It. The means of the tomato yields under the five mulching conditions are represented by *μ*_{1}, *μ*_{2}, *μ*_{3}, *μ*_{4}, *μ*_{5}. We will conduct a hypothesis test to determine if all means are the same or at least one is different. Using a significance level of 5 percent, test the null hypothesis that there is no difference in mean yields among the five groups against the alternative hypothesis that at least one mean is different from the rest.

The null and alternative hypotheses are as follows:

*H _{0}*: *μ*_{1} = *μ*_{2} = *μ*_{3} = *μ*_{4} = *μ*_{5}

*H _{a}*: *μ*_{i} ≠ *μ*_{j} for some *i ≠ j*

The one-way ANOVA results are shown in Table 13.5.

Source of Variation | Sum of Squares (SS) | Degrees of Freedom (df) | Mean Square (MS) | F |
---|---|---|---|---|
Factor (Between) | 36,648,561 | 5 – 1 = 4 | $\frac{36{,}648{,}561}{4} = 9{,}162{,}140$ | $\frac{9{,}162{,}140}{2{,}044{,}672.6} = 4.4810$ |
Error (Within) | 20,446,726 | 15 – 5 = 10 | $\frac{20{,}446{,}726}{10} = 2{,}044{,}672.6$ | |
Total | 57,095,287 | 15 – 1 = 14 | | |

**Distribution for the test:** *F*_{4,10}

*df*(*num*) = 5 – 1 = 4

*df*(*denom*) = 15 – 5 = 10

**Test statistic:** *F* = 4.4810

**Probability statement:** *p*-value = *P*(*F* > 4.481) = 0.0248

**Compare α and the p-value:** *α* = 0.05, *p*-value = 0.0248

**Make a decision:** Since *α* > *p*-value, we reject *H _{0}*.

**Conclusion:** At the 5 percent significance level, we have reasonably strong evidence that differences in mean yields for slicing tomato plants grown under different mulching conditions are unlikely to be due to chance alone. We may conclude that at least some of the mulches led to different mean yields.
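The arithmetic behind Table 13.5 and the *p*-value can be reproduced in a few lines. The following is a minimal sketch using only the Python standard library. The sums of squares and group counts come from the example, while the helper names `f_pdf` and `f_upper_tail` and the choice of Simpson's rule for the tail area are assumptions made for illustration; statistical software would normally supply this probability directly.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F distribution with d1, d2 degrees of freedom (d1 >= 2)."""
    if x <= 0:
        # At x = 0 the density is 1 when d1 == 2 and 0 when d1 > 2.
        return 1.0 if (x == 0 and d1 == 2) else 0.0
    log_pdf = (
        math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2) - math.lgamma(d2 / 2)
        + (d1 / 2) * math.log(d1 / d2)
        + (d1 / 2 - 1) * math.log(x)
        - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
    )
    return math.exp(log_pdf)


def f_upper_tail(x, d1, d2, steps=20000):
    """Approximate P(F > x) as 1 minus the area under the density on [0, x].

    Uses Simpson's rule; `steps` must be even.
    """
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3


# Values from the example: 5 groups, 15 observations in total.
k, n = 5, 15
ss_between, ss_within = 36_648_561, 20_446_726

df_num, df_denom = k - 1, n - k          # 4 and 10
ms_between = ss_between / df_num         # mean square for the factor
ms_within = ss_within / df_denom         # mean square for the error
f_stat = ms_between / ms_within
p_value = f_upper_tail(f_stat, df_num, df_denom)

print(f"MS between = {ms_between:.1f}, MS within = {ms_within:.1f}")
print(f"F = {f_stat:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

Run as written, this should give an *F* statistic of about 4.4810 and a *p*-value close to 0.0248, in line with the table.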

### Using the TI-83, 83+, 84, 84+ Calculator

To find these results on the calculator:

Press `STAT`. Press `1:EDIT`. Put the data into the lists L_{1}, L_{2}, L_{3}, L_{4}, L_{5}.

Press `STAT`, arrow over to `TESTS`, and arrow down to `ANOVA`. Press `ENTER`, and then enter (L_{1}, L_{2}, L_{3}, L_{4}, L_{5}). Press `ENTER`. You will see that the values in the foregoing ANOVA table are easily produced by the calculator, including the test statistic and the *p*-value of the test.

The calculator displays:

*F* = 4.4810

*p* = 0.0248 (*p*-value)

Factor

*df* = 4

*SS* = 36648560.9

*MS* = 9162140.23

Error

*df* = 10

*SS* = 20446726

*MS* = 2044672.6

MRSA, or methicillin-resistant *Staphylococcus aureus*, can cause serious bacterial infections in hospital patients. Table 13.6 shows various colony counts from different patients who may or may not have MRSA. The data from the table are plotted in Figure 13.5.

Conc = 0.6 | Conc = 0.8 | Conc = 1.0 | Conc = 1.2 | Conc = 1.4 |
---|---|---|---|---|
9 | 16 | 22 | 30 | 27 |
66 | 93 | 147 | 199 | 168 |
98 | 82 | 120 | 148 | 132 |

Plot of the data for the different concentrations:

Test whether the mean numbers of colonies are the same or are different. Construct the ANOVA table by hand or by using a TI-83, 83+, or 84+ calculator, find the *p*-value, and state your conclusion. Use a 5 percent significance level.

### Example 13.3

Four sororities took a random sample of sisters regarding their grade means for the past term. The results are shown in Table 13.7.

Sorority 1 | Sorority 2 | Sorority 3 | Sorority 4 |
---|---|---|---|
2.17 | 2.63 | 2.63 | 3.79 |
1.85 | 1.77 | 3.78 | 3.45 |
2.83 | 3.25 | 4.00 | 3.08 |
1.69 | 1.86 | 2.55 | 2.26 |
3.33 | 2.21 | 2.45 | 3.18 |

Using a significance level of 1 percent, is there a difference in mean grades among the sororities?

Let *μ*_{1}, *μ*_{2}, *μ*_{3}, *μ*_{4} be the population means of the sororities. Remember that the null hypothesis claims that the sorority groups are from the same normal distribution. The alternate hypothesis says that at least two of the sorority groups come from populations with different normal distributions. Notice that the four sample sizes are each five.

### Note

This is an example of a *balanced design*, because each group (i.e., each sorority) has the same number of observations.

*H _{0}*: *μ*_{1} = *μ*_{2} = *μ*_{3} = *μ*_{4}

*H _{a}*: Not all of the means *μ*_{1}, *μ*_{2}, *μ*_{3}, *μ*_{4} are equal.

**Distribution for the test:** *F*_{3,16}

where *k* = 4 groups and *n* = 20 samples in total.

*df*(*num*) = *k* – 1 = 4 – 1 = 3

*df*(*denom*) = *n* – *k* = 20 – 4 = 16

**Calculate the test statistic:** *F* = 2.23

**Probability statement:** *p*-value = *P*(*F* > 2.23) = 0.1241

**Compare α and the p-value:** *α* = 0.01, *p*-value = 0.1241, so *α* < *p*-value.

**Make a decision:** Since *α* < *p*-value, you cannot reject *H _{0}*.

**Conclusion:** There is not sufficient evidence to conclude that there is a difference among the mean grades for the sororities.

### Using the TI-83, 83+, 84, 84+ Calculator

Put the data into lists L_{1}, L_{2}, L_{3}, and L_{4}. Press `STAT` and arrow over to `TESTS`. Arrow down to `F:ANOVA`. Press `ENTER` and enter (`L1,L2,L3,L4`).

The calculator displays the *F* statistic, the *p*-value, and the values for the one-way ANOVA table:

*F* = 2.2303

*p* = 0.1241 (*p*-value)

Factor

*df* = 3

*SS* = 2.88732

*MS* = 0.96244

Error

*df* = 16

*SS* = 6.9044

*MS* = 0.431525
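The sums of squares reported by the calculator can also be computed directly from Table 13.7. The sketch below uses the standard one-way ANOVA definitions, which this section does not spell out: the between-group sum of squares adds up $n_j(\bar{x}_j - \bar{x})^2$ over the groups, and the within-group sum of squares adds up the squared deviation of each observation from its own group mean. Variable names are illustrative.

```python
from statistics import mean

# Grade means from Table 13.7, one list per sorority.
groups = [
    [2.17, 1.85, 2.83, 1.69, 3.33],
    [2.63, 1.77, 3.25, 1.86, 2.21],
    [2.63, 3.78, 4.00, 2.55, 2.45],
    [3.79, 3.45, 3.08, 2.26, 3.18],
]

k = len(groups)                        # number of groups
n = sum(len(g) for g in groups)        # total number of observations
all_values = [x for g in groups for x in g]
grand_mean = mean(all_values)
group_means = [mean(g) for g in groups]

# Between-group variation: how far each group mean is from the grand mean.
ss_between = sum(
    len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
)
# Within-group variation: how far each observation is from its own group mean.
ss_within = sum(
    (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
)

df_num, df_denom = k - 1, n - k
ms_between = ss_between / df_num
ms_within = ss_within / df_denom
f_stat = ms_between / ms_within

print(f"Factor: df = {df_num}, SS = {ss_between:.5f}, MS = {ms_between:.5f}")
print(f"Error:  df = {df_denom}, SS = {ss_within:.4f}, MS = {ms_within:.6f}")
print(f"F = {f_stat:.4f}")
```

Because the formulas weight each group by its own size, the same sketch would also work for an unbalanced design.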

Four sports teams took a random sample of players regarding their GPAs for the last year. The results are shown in Table 13.8.

Basketball | Baseball | Hockey | Lacrosse |
---|---|---|---|
3.6 | 2.1 | 4.0 | 2.0 |
2.9 | 2.6 | 2.0 | 3.6 |
2.5 | 3.9 | 2.6 | 3.9 |
3.3 | 3.1 | 3.2 | 2.7 |
3.8 | 3.4 | 3.2 | 2.5 |

Use a significance level of 5 percent and determine if there is a difference in GPA among the teams.

### Example 13.4

A fourth-grade class is studying the environment. One of the assignments is to grow bean plants in different soils. Tommy chose to grow his bean plants in soil found outside his classroom mixed with dryer lint. Tara chose to grow her bean plants in potting soil bought at the local nursery. Nick chose to grow his bean plants in soil from his mother’s garden. No chemicals were used on the plants, only water. They were grown inside the classroom next to a large window. Each child grew five plants. At the end of the growing period, each plant was measured, producing the data in inches in Table 13.9.

Tommy's Plants | Tara's Plants | Nick's Plants |
---|---|---|
24 | 25 | 23 |
21 | 31 | 27 |
23 | 23 | 22 |
30 | 20 | 30 |
23 | 28 | 20 |

Does it appear that the three media in which the bean plants were grown produce the same mean height? Test at a 3 percent level of significance.

This time, we will perform the calculations that lead to the *F* statistic. Notice that each group has the same number of plants, so we will use the formula $F = \frac{n \cdot s_{\overline{x}}^{2}}{s^{2}_{\text{pooled}}}$.

First, calculate the sample mean and sample variance of each group.

Statistic | Tommy’s Plants | Tara’s Plants | Nick’s Plants |
---|---|---|---|
Sample Mean | 24.2 | 25.4 | 24.4 |
Sample Variance | 11.7 | 18.3 | 16.3 |

Next, calculate the variance of the three group means by calculating the variance of 24.2, 25.4, and 24.4. Variance of the group means = 0.413 = $s_{\overline{x}}^{2}$, then *MS _{between}* = $n s_{\overline{x}}^{2}$ = (5)(0.413), where *n* = 5 is the sample size (number of plants each child grew).

Calculate the mean of the three sample variances (calculate the mean of 11.7, 18.3, and 16.3). Mean of the sample variances = 15.433 = *s*^{2}_{pooled}, then *MS _{within}* = *s*^{2}_{pooled} = 15.433.

The *F* statistic (or *F* ratio) is $F = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{n s_{\overline{x}}^{2}}{s^{2}_{\text{pooled}}} = \frac{(5)(0.413)}{15.433} = 0.134$.

The *df*s for the numerator = the number of groups – 1 = 3 – 1 = 2.

The *df*s for the denominator = the total number of samples – the number of groups = 15 – 3 = 12.

The distribution for the test is *F*_{2,12} and the *F* statistic is *F* = 0.134.

The *p*-value is *P*(*F* > 0.134) = 0.8759.

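**Decision:** Since *α* = 0.03 is less than the *p*-value = 0.8759, do not reject *H _{0}*. An *F* statistic this small or larger would occur about 88 percent of the time if the three population means were equal.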

**Conclusion:** With a 3 percent level of significance from the sample data, the evidence is not sufficient to conclude that the mean heights of the bean plants are different.
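The hand calculation in this example can be checked with a short script. This is a sketch of the equal-sample-size shortcut $F = n\,s_{\overline{x}}^{2} / s^{2}_{\text{pooled}}$ applied to the heights in Table 13.9. It relies on `statistics.variance`, which computes the sample variance (dividing by one less than the number of values); the variable names are illustrative.

```python
from statistics import mean, variance

# Plant heights in inches from Table 13.9.
heights = {
    "Tommy": [24, 21, 23, 30, 23],
    "Tara": [25, 31, 23, 20, 28],
    "Nick": [23, 27, 22, 30, 20],
}

n = 5  # plants per child; the shortcut requires equal group sizes
assert all(len(h) == n for h in heights.values())

group_means = [mean(h) for h in heights.values()]
group_vars = [variance(h) for h in heights.values()]

ms_between = n * variance(group_means)  # n times the variance of the group means
ms_within = mean(group_vars)            # pooled variance: mean of the group variances
f_stat = ms_between / ms_within

df_num = len(heights) - 1               # number of groups minus 1
df_denom = n * len(heights) - len(heights)  # total observations minus number of groups

print("group means:", [round(m, 1) for m in group_means])
print("group variances:", [round(v, 1) for v in group_vars])
print(f"F = {f_stat:.3f} with df = ({df_num}, {df_denom})")
```

The script carries full precision through each step, so intermediate values can differ slightly in the last digit from the rounded figures used in the hand calculation.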

### Using the TI-83, 83+, 84, 84+ Calculator

To calculate the *p*-value:

- Press `2nd DISTR`,
- Arrow down to `Fcdf` and press `ENTER`,
- Enter `0.134, E99, 2, 12`, and
- Press `ENTER`.

The *p*-value is 0.8759.
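Away from the calculator, this particular tail probability has a closed form. When the numerator has exactly 2 degrees of freedom, a standard result (not derived in this chapter) gives $P(F > x) = \left(1 + \frac{2x}{d}\right)^{-d/2}$, where $d$ is the denominator degrees of freedom. The sketch below applies it to the same values entered into `Fcdf`; the function name is hypothetical, and the shortcut does not apply to other numerator degrees of freedom.

```python
def f_upper_tail_2df(x, df_denom):
    """P(F > x) for an F distribution with 2 numerator degrees of freedom."""
    if x < 0:
        return 1.0  # F is never negative, so all of the area lies above x
    return (1 + 2 * x / df_denom) ** (-df_denom / 2)


alpha = 0.03
p_value = f_upper_tail_2df(0.134, 12)

print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The result should agree with the calculator's 0.8759 to four decimal places.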

Another fourth grader also grew bean plants, but in a jelly-like mass. The heights were (in inches) 24, 28, 25, 30, and 32. Do a one-way ANOVA test on the four groups. Are the heights of the bean plants different? Use the same method as shown in Example 13.4.

### Collaborative Exercise

From the class, create four groups of the same size as follows: men under 22, men at least 22, women under 22, women at least 22. Have each member of each group record the number of states in the United States he or she has visited. Run an ANOVA test to determine whether the mean number of states visited is the same in all four groups. Test at a 1 percent level of significance. Use one of the solution sheets in Appendix E.