One-Way Analysis of Variance (ANOVA)


Research Scenario

Question

Dana wishes to assess whether vitamin C is effective in the treatment of colds. To evaluate her hypothesis, she decides to conduct a two-year experimental study. She obtains 30 volunteers from undergraduate classes to participate. She randomly assigns an equal number of students to three groups:

  • placebo (group 1),
  • low doses of vitamin C (group 2), and
  • high doses of vitamin C (group 3).

In the first and second years of the study, students in all three groups were monitored to assess the number of days they experienced cold symptoms. During the first year of Dana’s study, participants did not take any vitamin C pills. In the second year of the study, participants took pills that contained one of the following:

  • no active ingredients—placebo (group 1)
  • low doses of vitamin C (group 2)
  • high doses of vitamin C (group 3)

Instrument & Scoring

Dana’s data file includes 30 cases and two variables: a factor distinguishing among the three treatment groups (GROUP) and a dependent variable (DIFF) which is the difference in the number of days participants experienced cold symptoms in the first year versus the number of days they experienced cold symptoms in the second year.

Research Hypothesis

A one-way analysis-of-variance test (ANOVA) is used to assess whether means on a dependent variable are significantly different among groups. In Dana’s research study, she is interested in evaluating whether the mean change in the number of days experiencing cold symptoms differed among her three experimental populations.

If Dana’s overall ANOVA analysis is significant (p value less than .05) and a factor has more than two levels, Dana will conduct follow-up tests to determine whether there is a relationship between the amount of vitamin C taken and the change in the number of days individuals experienced cold symptoms. Dana’s follow-up tests would involve comparisons between the pairs of means in her study’s treatment groups (placebo, low dose, high dose). To adequately evaluate each treatment group, Dana would compare the means of:

  • placebo and low dose,
  • placebo and high dose,
  • low dose and high dose

using either a follow-up comparison that assumes equal variances, such as Tukey’s Honestly Significant Difference (HSD), or one that does not, such as a Dunnett C test. With these comparisons, Dana will look to determine where differences among treatment groups exist.

Note

Only Tukey’s HSD and Dunnett C tests are presented below. For additional information on other follow-up comparisons, such as LSD or REGWQ, please review relevant materials.

ANOVA Workflow

In Dana’s current research study, the null hypothesis is that there is no difference among the three treatment populations in the mean change in the number of days participants experienced cold symptoms. If Dana cannot reject the null hypothesis, she would conclude her study and not conduct any follow-up comparisons.

If, however, Dana did obtain a significant ANOVA result, she would proceed to conduct follow-up comparisons to determine whether there is a relationship between the amount of vitamin C taken and the change in the number of days that individuals show cold symptoms.

Assumptions

  1. The test variable is normally distributed for each of the populations (as defined by the different levels of the factor).

Note

In many applications with moderate or larger sample sizes, a one-way ANOVA may yield reasonably accurate p values even when the normality assumption is violated. If the population distributions are substantially nonnormal, alternative preprocessing methods, such as a log or reciprocal transformation, should be evaluated. However, if the data appear to be thick-tailed or heavily skewed, a Kruskal-Wallis test should be considered, as it does not assume normality and compares groups on ranks (medians) rather than means.

 

  2. The variances of the dependent variable are the same for all populations. If this assumption is violated and the sample sizes differ among groups, the resulting p value for the overall F test is untrustworthy. Under these conditions, a Brown-Forsythe test or the Welch statistic should be considered. Furthermore, for follow-up comparisons, the validity of the results is questionable if the population variances differ, regardless of whether the sample sizes are equal or unequal. If the variances differ, it is appropriate to use a follow-up comparison test that does not assume equal population variances; Dunnett’s C test is recommended in instances where the variances are unequal.

  3. The cases represent a random sample from the population, and the scores on the test variable are independent of each other.

Check

Of the three major assumptions for the ANOVA test, only the first two can be evaluated using data alone. The third assumption is something that is best addressed through implementing proper research practices—simply put, if this assumption is found to be violated in the data analysis phase there is little the researcher can do to address the issue.

Scores are Normally Distributed

To evaluate the first assumption of normality, histograms were developed for each of Dana’s treatment groups and are presented below.

Treatment group histograms offer insight into whether scores are normally distributed



Reviewing the plots above, it appears that the three groups may not approximate a normal distribution, though as mentioned above, ANOVA typically yields reasonably accurate p values even when the normality assumption is violated. The larger concern is confirming that the data do not appear to be thick-tailed or heavily skewed, which in this case they do not. The more important assumption within ANOVA is discussed next. However, Dunnett C follow-up comparisons are conducted below to account for the nonnormal distributions.

Equal Variances

Treatment group boxplots offer insight into whether group variances are equal



Reviewing the boxplots above, it appears that the groups in Dana’s study may not have equal variances. However, the graphs alone are not enough to determine whether this assumption has been violated. To fully evaluate this assumption, Dana will need to conduct Levene’s Test for Equal Variances, which checks whether variances among treatment groups within a set of data are equal. What makes Levene’s test appealing is that it does not require scores to be normally distributed. The steps for conducting the analysis are outlined later and are used to determine whether the equal-variances assumption has been met.

ANOVA Overview

\(\alpha\) Level Inflation

Before proceeding with the ANOVA discussion, it is appropriate to take a moment and discuss why an ANOVA is used rather than multiple, separate t-test comparisons. As was presented by Dana, there are three groups she is interested in comparing; conducting comparisons among the three groups could be as simple as running three t-tests, right? WRONG… well, not really WRONG, rather just wrong.

Every time a researcher conducts a t-test there is a chance of making a Type I error. The probability of this error is usually 5% (recall, this is dictated by the \(\alpha\) level, which is set to .05, or 5%). However, when a researcher runs two t-tests on the same data, the chance of committing a Type I error increases; in effect, the \(\alpha\) level roughly doubles to .10, or 10%. The error rate for multiple t-tests is not as simple as multiplying 5% by the number of tests; the familywise rate for \(m\) tests is \(1 - (1 - \alpha)^m\), though the doubling approximation offers a convenient view into the \(\alpha\) level inflation from multiple t-tests. As such, if the researcher wanted to conduct three t-tests, the adjusted \(\alpha\) would now be roughly 15% (well, actually, 14.3%), but the issue should be clear.

This is where an ANOVA comes in. Specifically, an ANOVA controls for the \(\alpha\) inflation issue so that a researcher’s Type I error rate remains at 5%, allowing greater confidence that any statistically significant result is trustworthy rather than simply the byproduct of running many tests.
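The inflation described above follows directly from the familywise error-rate formula \(1 - (1-\alpha)^m\). A minimal sketch in Python (the helper name is mine, not from the source):

```python
# Familywise Type I error rate for m independent tests at level alpha:
# FWER = 1 - (1 - alpha)^m
def familywise_error(alpha, m):
    return 1 - (1 - alpha) ** m

print(round(familywise_error(0.05, 2), 4))  # 0.0975, roughly the "doubled" 10%
print(round(familywise_error(0.05, 3), 4))  # 0.1426, the 14.3% noted above
```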

ANOVA is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means by examining the variances of the samples that are taken. ANOVA allows a researcher to determine whether the differences between samples are simply due to random error (sampling error) or whether there are systematic treatment effects that cause the mean of one group to differ from the mean of another.

Most of the time ANOVA is used to compare the equality of three or more means (as is being done with Dana’s example discussed above); however, when the means from two samples are compared using ANOVA, the result is equivalent to using a t-test to compare the means of independent samples.

Note

If an ANOVA is used to compare two means in a statistical package, an error may be generated indicating that follow-up comparisons cannot be conducted. This error can be safely ignored, as a significant difference between two means is readily apparent without follow-up tests.

ANOVA Components

The equation for a one-way ANOVA is presented below.

\[\text{One-Way ANOVA}\\~\\F =\frac{\text{Mean Squares Treatment}_{(MST)}}{\text{Mean Squares Error}_{(MSE)}} \]

Looking at the equation, one should note that ANOVA is a ratio that compares variance (or variation) between the data samples to variation within each particular sample (group). The goal is to determine whether the between variation is larger than the within variation; if so, the F ratio will be greater than 1, indicating that the group means are unlikely to be equal. If the between and within variations are approximately the same size, then there will be no significant difference between sample means, and the F ratio will be close to 1 (in rare cases, or when there are data collection issues, slightly less than 1).

Breaking ANOVA Into Its Component Parts

As discussed earlier, the goal of ANOVA is to partition the total variation of the data into a portion due to random error (which yields \(\text{Mean Squares Error}_{(MSE)}\)) and a portion due to changes in the values of the independent variable (which yields \(\text{Mean Squares Treatment}_{(MST)}\)). To do this, the variation in the response measurements is partitioned into two components, which are, unsurprisingly, known as Error and Treatment, or more precisely \(\text{Sum of Squares}_{\text{Error}}\) and \(\text{Sum of Squares}_{\text{Treatment}}\)—more commonly referred to as \(SS_{\text{Error}}\) and \(SS_{\text{Treatment}}\)—which sum together to produce \(\text{Sum of Squares}_{\text{Total}}\) (\(SS_{\text{Total}}\)).

The equation below illustrates how \(SS_{\text{Total}}\) is partitioned into \(SS_{\text{Treatment}}\) and \(SS_{\text{Error}}\).

\[\begin{array}{ccccc} SS_{\text{Total}} & = & SS_{\text{Treatment}} & + & SS_\text{Error} \\ & & & & \\ \sum\limits_{i\text{ = 1}}^{k}\sum\limits_{j\text{ = 1}}^{n}(Y_{ij} - \overline Y_{\huge\cdot\cdot})^2 & = & \sum\limits_{i\text{ = 1}}^{k}n_{i}(\overline Y_{i \huge\cdot} - \overline Y_{\huge\cdot\cdot})^2 & + & \sum\limits_{i\text{ = 1}}^{k}\sum\limits_{j\text{ = 1}}^{n}(Y_{ij} - \overline Y_{i\huge\cdot})^2 \ \end{array}\]

Equation Components

Moving from left to right, the equation components are outlined below.

  • \(k\) is the number of treatments; in Dana’s research \(k = 3\).

  • \(n\) is the number of observations within a treatment group.

  • \(Y_{ij}\) are the individual scores within a set of data; Dana’s total set of data includes 30 separate scores.

  • \(Y_{i\huge\cdot}\) are the individual scores within a given treatment; each group in Dana’s study had 10 separate observations.

  • Each \(n_{i}\) is the number of observations for treatment \(i\), while \(N\) is the total number of observations within the data. Reviewing Dana’s data, she has 3 groups, each with 10 observations, which means that \(n_{i}\) is equal to 10 and Dana’s \(N\) is equal to 30.

  • \(\overline Y_{i\huge{\cdot}}\) are the means for each group within a set of data; as mentioned earlier, Dana’s data has 3 groups, which means that she will have 3 separate group means.

  • \(\overline Y_{\huge\cdot\cdot}\) is the grand or overall mean; in Dana’s research the grand/total mean was -0.2.

  • \(j\) indexes the individual observations within each group, while \(i\) indexes the separate groups; Dana has three separate groups.

FYI: Given that Dana’s groups are equal in size, the subscript on \(n_{i}\) is not strictly necessary; however, if group sizes did differ, it would be, otherwise the computation would not work properly.

Compute Sum of Squares

Sum Squares Total

\[ SS_\text{Total}\\\sum\limits_{i\text{ = 1}}^{k}\sum\limits_{j\text{ = 1}}^{n}(Y_{ij} - \overline Y_{\huge\cdot\cdot})^2 \]

Focusing on the first portion of the Sum of Squares equation, the \(SS_\text{Total}\), presented above, is computed by subtracting the Grand Mean, found to be -0.2, from each score—regardless of group affiliation—in Dana’s study and then squaring each difference. Once that has been completed for all scores, the squared differences are then summed together (hence the name, \(SS_{Total}\)).

As mentioned earlier, Dana’s set of data includes information for 30 individuals. Rather than walk through each of the 30 calculations, the first difference will be computed followed by a table that presents each of the squared differences for Group 1 participants followed by a table that aggregates \(SS_{Total}\) by treatment group which sums each group’s Sum of Squares together to arrive at \(SS_{Total}\).

Taking the first observation from Dana’s set of data, 12, and subtracting the grand mean of -0.2 results in a difference of 12.2 that when squared is 148.84. This score can be seen in the squared differences for the Group 1 table below.

Sum of Squares Total Calculation for Group 1
Difference Score Grand Mean SS Total
12 -0.2 148.84
-2 -0.2 3.24
9 -0.2 84.64
3 -0.2 10.24
3 -0.2 10.24
0 -0.2 0.04
3 -0.2 10.24
2 -0.2 4.84
4 -0.2 17.64
1 -0.2 1.44
Group 1 SS Total 291.4
Note:
SS Total computed as the squared difference between each score and the grand mean.

The table above presents the sums of squares for each score within Group 1. Repeating the steps for Groups 2 and 3 results in the table presented below.

Sum of SS Total Across Groups
Treatment Group SS Total
1 291.4
2 185
3 302.4
SS Total 778.8
Note:
SS Total is highlighted above.

Contained in the table above is the \(SS_{Total}\) for Dana’s study, which was found to be 778.8.
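Group 1’s contribution to \(SS_{Total}\) can be reproduced with a short sketch (scores taken from the Group 1 table above; variable names are mine):

```python
# Squared deviation of each Group 1 score from the grand mean (-0.2), summed.
group1 = [12, -2, 9, 3, 3, 0, 3, 2, 4, 1]
grand_mean = -0.2

ss_total_g1 = sum((y - grand_mean) ** 2 for y in group1)
print(round(ss_total_g1, 1))  # 291.4, matching the table
```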

Sum of Squares Treatment

\[SS_\text{Treatment}\\\sum\limits_{i\text{ = 1}}^{k}n_{i}(\overline Y_{i \huge\cdot} - \overline Y_{\huge\cdot\cdot})^2\]

Focusing on the next portion of the Sum of Squares equation, the \(SS_\text{Treatment}\), presented above, is computed by subtracting the Grand Mean from each Group’s mean, squaring each difference, and then multiplying the result by the Group’s number of cases. Once that has been completed for each group’s mean score, the squared differences are then summed together to determine \(SS_{\text{Treatment}}\) for Dana’s set of data.

Taking the Placebo group’s mean score of 3.5 and subtracting the grand mean of -0.2 yields a difference of 3.7; squaring this (13.69) and then multiplying by 10 results in a squared difference of 136.9. This value can be seen in the table below.

Sum of SS Treatment Across Groups
Treatment Group Group N Group Mean Group SS Treatment
1 10 3.5 136.9
2 10 -2.1 36.1
3 10 -2 32.4
SS Treatment 205.4
Note:
SS Treatment is highlighted above.

At this point, enough information has been computed such that \(SS_{\text{Error}}\) could be inferred by subtracting \(SS_{\text{Treatment}}\), 205.4, from \(SS_{\text{Total}}\), 778.8, which would result in 573.4. However, rather than conclude the walkthrough here, steps for how to compute \(SS_{\text{Error}}\) are presented below.
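As a check on the arithmetic, \(SS_{\text{Treatment}}\) can be recomputed from the group means reported above (a sketch; names are mine):

```python
# SS Treatment: n * (group mean - grand mean)^2, summed across the groups.
group_means = [3.5, -2.1, -2.0]  # placebo, low dose, high dose
n = 10
grand_mean = -0.2

ss_treatment = sum(n * (m - grand_mean) ** 2 for m in group_means)
print(round(ss_treatment, 1))  # 205.4, matching the table
```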

Sum of Squares Error

\[ SS_\text{Error}\\\sum\limits_{i\text{ = 1}}^{k}\sum\limits_{j\text{ = 1}}^{n}(Y_{ij} - \overline Y_{i\huge\cdot})^2 \]

Focusing on the last portion of the Sum of Squares equation presented earlier, \(SS_\text{Error}\) is computed by subtracting each Group’s mean from each score within that Group and squaring each difference. Once that has been completed for all scores, the squared differences within each Group are summed to produce that Group’s \(SS_{\text{Error}}\). The Group sums are then added together to determine \(SS_{\text{Error}}\) for Dana’s study.

Similar to \(SS_{\text{Total}}\) above, the first difference will be computed followed by a table that presents each of the squared differences for Group 1 participants followed by a table that aggregates \(SS_\text{Error}\) by treatment group which then sums them together to arrive at \(SS_\text{Error}\) for Dana’s study.

Taking the first score in the Placebo group, 12, and subtracting the group mean of 3.5 results in a difference of 8.5, which when squared gives 72.25. This value can be seen in the table below.

Sum of Squares Error Calculation for Group 1
Difference Score Group Mean SS Error
12 3.5 72.25
-2 3.5 30.25
9 3.5 30.25
3 3.5 0.25
3 3.5 0.25
0 3.5 12.25
3 3.5 0.25
2 3.5 2.25
4 3.5 0.25
1 3.5 6.25
Group 1 SS Error 154.5
Note:
SS Error computed as the squared difference between each score and its group mean.

The table above presents the sums of squares for each score within Group 1. Repeating the steps for Groups 2 and 3 results in the table presented below.

Sum of SS Error Across Groups
Treatment Group SS Error
1 154.5
2 148.9
3 270
SS Error 573.4
Note:
SS Error is highlighted above.

Contained in the table above is the \(SS_{Error}\) for Dana’s study, which was found to be 573.4.
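The Group 1 calculation and the subtraction shortcut mentioned earlier can both be verified in a few lines (values from the tables above):

```python
# Group 1's SS Error: squared deviations from the group mean.
group1 = [12, -2, 9, 3, 3, 0, 3, 2, 4, 1]
group1_mean = 3.5

ss_error_g1 = sum((y - group1_mean) ** 2 for y in group1)
print(ss_error_g1)             # 154.5, matching the table

# Study-wide SS Error recovered as SS Total - SS Treatment.
ss_error = 778.8 - 205.4
print(round(ss_error, 1))      # 573.4
```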

Revisiting Sum of Squares Equation

The equation below has been updated to show the sum of square values that were calculated earlier to show how the variance in Dana’s study has been partitioned into \(SS_{\text{Treatment}}\) and \(SS_{\text{Error}}\).

\[\begin{array}{ccccc} SS_{\text{Total}} & = & SS_{\text{Treatment}} & + & SS_\text{Error} \\ & & & & \\ \sum\limits_{i\text{ = 1}}^{k}\sum\limits_{j\text{ = 1}}^{n}(Y_{ij} - \overline Y_{\huge\cdot\cdot})^2 & = & \sum\limits_{i\text{ = 1}}^{k}n_{i}(\overline Y_{i \huge\cdot} - \overline Y_{\huge\cdot\cdot})^2 & + & \sum\limits_{i\text{ = 1}}^{k}\sum\limits_{j\text{ = 1}}^{n}(Y_{ij} - \overline Y_{i\huge\cdot})^2\\ & & & & \\ 778.8 & = & 205.4 & + & 573.4 \end{array}\]

Now that \(\text{SS}_{\text{Total}}\), \(\text{SS}_{\text{Treatment}}\), and \(\text{SS}_{\text{Error}}\) have been determined, it is time to conduct the ANOVA analysis.

Compute the ANOVA Analysis

As presented earlier, the ANOVA equation is:

\[\text{One-Way ANOVA}\\~\\F =\frac{\text{Mean Squares Treatment}_{(MST)}}{\text{Mean Squares Error}_{(MSE)}} \]

\[\text{Mean Squares Treatment}_{(MST)} = \frac{\text{SS}_{\text{Treatment}}}{k - 1} \]

  • \(k\) is the number of groups in Dana’s study, which is 3.

\[\text{Mean Squares Treatment}_{(MST)} = 102.7\]

\[\text{Mean Squares Error}_{(MSE)} = \frac{\text{SS}_{\text{Error}}}{N - k} \]

  • \(k\) is the number of groups in Dana’s study, which is 3.
  • \(N\) is the total number of participants in Dana’s study, which is 30.

\[\text{Mean Squares Error}_{(MSE)} = 21.24\]

\(\text{Mean Squares}_\text{Treatment}\) = 102.7
\(\text{Mean Squares}_\text{Error}\) = 21.24

Plugging the Mean Squares information into the ANOVA equation results in:

\[ F = \frac{\text{Mean Squares Treatment}_{(MST)}}{\text{Mean Squares Error}_{(MSE)}} \]
\[ F = \frac{102.7}{21.24} \]

\[ F = 4.836 \]
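The same arithmetic, collected in one place (values from the walkthrough; a sketch, not output from a statistics package):

```python
# Assemble the F ratio from the sums of squares computed above.
k, N = 3, 30
ss_treatment, ss_error = 205.4, 573.4

mst = ss_treatment / (k - 1)   # Mean Squares Treatment
mse = ss_error / (N - k)       # Mean Squares Error
f = mst / mse
print(round(mst, 1), round(mse, 2), round(f, 3))  # 102.7 21.24 4.836
```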

Levene’s Test for Equal Variances

The equal-variances assumption states that the variances of the dependent variable are the same for all populations; boiled down, it suggests that the variances within each group are relatively equal to one another, that is, no one group has a sample variance larger than twice the size of another treatment group’s. To evaluate whether equal variances can be assumed, a Levene Test for Equality of Variances (Levene’s test) is conducted. Computationally, the process for conducting Levene’s test is similar to the steps outlined above for computing an ANOVA; the key difference is that the ANOVA is carried out on the absolute values of the deviations of each score from its group mean rather than on the raw scores. The process should become clearer below.

Create Mean Absolute Difference Score

Taking the first observation from Group 1 within Dana’s set of data, 12, and subtracting Group 1’s mean of 3.5 from participant one’s score and then taking the absolute value of the difference results in an absolute difference of 8.5. Doing this for all participants in Dana’s set of data results in a new variable called Absolute Difference.

Levene’s \(SS_{\text{Total}}\)

Once Absolute Difference has been created, the Grand Mean of the Absolute Difference, found to be 3.47, is subtracted from each Absolute Difference score—regardless of group affiliation—in Dana’s study and then squared. Once completed for all scores, the squared differences are then summed together to create \(SS_{Total}\) for the Levene’s test.

Rather than walk through each of the remaining 29 calculations in Dana’s set of data, the table below presents information for Group 1 participants followed by a table that aggregates \(SS_{Total}\) by treatment group which then sums together each group’s Sum of Squares together to arrive at \(SS_{Total}\) for Levene’s test.

Levene's Test SS Total for Group 1
Difference Score Group Mean Absolute Difference Squared Mean Difference
12 3.5 8.5 25.3
-2 3.5 5.5 4.12
9 3.5 5.5 4.12
3 3.5 0.5 8.82
3 3.5 0.5 8.82
0 3.5 3.5 0
3 3.5 0.5 8.82
2 3.5 1.5 3.88
4 3.5 0.5 8.82
1 3.5 2.5 0.94
Group 1 SS Total 73.64
Note:
SS Total computed as the difference between each absolute difference score and the grand mean of absolute difference which was 3.47.

The table above presents \(SS_{\text{Total}}\) for each score within Group 1. Repeating the steps for Groups 2 and 3 results in the table presented below.

Levene's Test SS Total Across Groups
Treatment Group SS Total
1 73.64
2 68.04
3 71.16
SS Total 212.84
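Group 1’s contribution to Levene’s \(SS_{Total}\) can be reproduced as follows (scores and the reported grand mean of the absolute differences, 3.47, taken from above; small rounding differences against the table are expected because the table rounds each row first):

```python
# Absolute deviations from the group mean, then squared deviations from the
# grand mean of the absolute differences (3.47, as reported).
group1 = [12, -2, 9, 3, 3, 0, 3, 2, 4, 1]
group1_mean = 3.5
abs_grand_mean = 3.47

abs_diff = [abs(y - group1_mean) for y in group1]
ss_total_g1 = sum((d - abs_grand_mean) ** 2 for d in abs_diff)
print(round(ss_total_g1, 2))  # 73.65 (73.64 in the table due to row rounding)
```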

Levene’s \(SS_{\text{Treatment}}\)

As done in the earlier ANOVA analysis, once Levene’s \(SS_{\text{Total}}\) has been determined, Levene’s \(SS_{\text{Treatment}}\) should be computed. Levene’s \(SS_{\text{Treatment}}\) is computed by subtracting the Grand Mean of the Absolute Difference from each Group’s Mean Absolute Difference, squaring each difference and then multiplying the resulting product by the Group’s number of cases. Once that has been completed for each treatment group’s mean score, the squared differences are then summed together to determine Levene’s \(SS_{\text{Treatment}}\) for Dana’s set of data.

Taking the Placebo group’s mean absolute difference score of 2.9 and subtracting the grand mean of the absolute differences, 3.47, results in a difference of -0.57; squaring this (0.32) and then multiplying by 10 results in a squared difference of 3.2. This value can be seen in the squared differences for Group 1 below.

Levene's Test SS Treatment
Treatment Group Group N Group Mean Group SS Treatment
1 10 2.9 3.2
2 10 2.9 3.2
3 10 4.6 12.8
SS Treatment 19.2
Note:
SS Treatment computed as the difference between each group mean of absolute difference score and the grand mean of absolute difference which was 3.47.

Levene’s \(SS_{\text{Error}}\)

As mentioned earlier in the ANOVA analysis, after determining \(SS_{\text{Total}}\) and \(SS_{\text{Treatment}}\), \(SS_\text{Error}\) can be inferred. Therefore, rather than work through the steps for computing \(SS_\text{Error}\) for a second time the \(SS_\text{Total}\) will be reworked to solve for \(SS_\text{Error}\).

Specifically,

\[SS_{\text{Total}} = SS_{\text{Treatment}} + SS_\text{Error}\]

can be reworked such that

\[SS_{\text{Error}} = SS_{\text{Total}} - SS_\text{Treatment}\]

With the adjusted formula in mind, input the known SS information to solve for \(SS_{\text{Error}}\).

\[\begin{array}{ccccc} SS_{\text{Error}} & = & SS_{\text{Total}} & - & SS_\text{Treatment} \\ SS_{\text{Error}} & = & 212.84 & - & 19.2 \\ SS_{\text{Error}} & = & 193.64 & & \end{array}\]

ANOVA Analysis for Levene’s

Similarly here, since the steps for conducting an ANOVA analysis have been discussed in detail previously, they will be omitted from this portion of the walkthrough. The equation below illustrates the \(\text{Mean Squares Treatment}_{(MST)}\) and \(\text{Mean Squares Error}_{MSE}\) inputs used to compute the Levene’s test.

\[ \text{Levene's Test}\\~\\F = \frac{9.6}{7.17} \]

\[ \text{Levene's Test}\\~\\F = 1.34 \]
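Assembling Levene’s F from the sums of squares computed above (a sketch, mirroring the ANOVA computation):

```python
# Levene's test statistic from its sums of squares.
k, N = 3, 30
ss_treatment, ss_error = 19.2, 193.64

mst = ss_treatment / (k - 1)   # 9.6
mse = ss_error / (N - k)       # ~7.17
f = mst / mse
print(round(f, 2))  # 1.34
```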

ANOVA Conclusion

The observed ANOVA appears to be significant at the p < .05 level since the computed F value of 4.836 is larger than the critical F value of 3.354. Follow-up comparisons should be conducted to determine where differences among the groups may lie.

To report the output of the significant analysis, findings should be presented as: F(2,27) = 4.836, p < .05.

Furthermore, the computed Levene's test F value of 1.34 is smaller than its critical F value, 3.35, indicating a nonsignificant result with p > .05. As a result, equal variances can be assumed.
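Rather than reading a printed table, the critical value cited above can be obtained from the F distribution directly; a sketch assuming SciPy is available (both the overall ANOVA and Levene’s test use (2, 27) degrees of freedom):

```python
from scipy.stats import f

# Critical F at alpha = .05 with (2, 27) degrees of freedom.
f_crit = f.ppf(0.95, dfn=2, dfd=27)
print(round(f_crit, 3))  # ~3.354
```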

Follow-Up Comparison - Tukey Honestly Significant Difference (HSD)

Description

Tukey’s HSD follow-up comparison should be conducted after an ANOVA analysis results in a significant difference.

Note

Tukey’s HSD was designed for situations with equal sample sizes per group but can be adapted to unequal sample sizes; the simplest adaptation uses the harmonic mean of the group sizes as \(n\). Dana’s study has equal-sized treatment groups, so there is no need to compute a harmonic mean.


The Tukey HSD equation is presented below.

\[ \text{Tukey}_{HSD} = q \sqrt{\frac{\text{Mean Squares Error}_{(MSE)}}{n}} \]

  • \(q\) is a critical value of the studentized range statistic and is determined using a \(q\) reference table of critical values. For Dana’s study, \(q\) is 3.51.

  • \(n\) is the number of observations within a treatment group.

\[ \text{Tukey}_{HSD} = 3.51 \sqrt{\frac{21.24}{10}} \]

\[ \text{Tukey}_{HSD} = 5.11 \]

Furthermore, 95% Confidence Intervals for \(\text{Tukey}_{HSD}\) can be done using the following equation.

\[ Y_i - Y_j \pm q_{\alpha,k,N-k} \sqrt{\left(\frac{MSE}{n_i}\right)} \]

  • \(n_{i}\) is the number of observations within each group under investigation. The 95% Confidence Intervals for each difference are presented in the table below.
Tukey HSD Follow-Up Comparison Results
Group 1 Group 2 Mean Difference (G.1 - G.2) 95% CI Lower Bound 95% CI Upper Bound
Placebo (3.5) Low Dose (-2.1) *5.6 - significant 0.49 10.71
Placebo (3.5) High Dose (-2) *5.5 - significant 0.39 10.61
Low Dose (-2.1) High Dose (-2) -0.1 - nonsignificant -5.21 5.01
1 Differences in the table marked by an asterisk are significant at p < .05.
2 The MSE error term = 21.24 and the Tukey HSD half-width = 5.11
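The HSD half-width behind these intervals can be recomputed from the reported values; with the tabled q of 3.51 it comes out near 5.12, while the CI half-widths above correspond to roughly 5.11 (a slightly more precise q):

```python
import math

# Tukey HSD half-width: q * sqrt(MSE / n), with q = 3.51 (studentized range,
# k = 3, df = 27), MSE = 21.24, and n = 10 per group.
q, mse, n = 3.51, 21.24, 10
hsd = q * math.sqrt(mse / n)
print(round(hsd, 2))  # 5.12

# Each pairwise CI is the mean difference plus/minus this half-width,
# e.g. Placebo vs Low Dose: 5.6 +/- hsd.
print(round(5.6 - hsd, 2), round(5.6 + hsd, 2))
```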

Follow-Up Comparison - Dunnett C Test

As mentioned earlier, the score distributions within Dana’s set of data didn’t appear to be normal. Given these concerns, along with a slightly elevated Levene’s test output (not enough to invalidate the test finding, but possibly still a concern), Dana may decide to conduct follow-up comparisons that do not assume equal variances are present within her data. A common follow-up comparison for a scenario similar to Dana’s is the Dunnett C test. The Dunnett C test, unlike Tukey’s HSD, doesn’t require equal variances among the groups within a set of data. Additionally, the Dunnett C test as applied here doesn’t require pairwise comparisons for all treatment groups within a set of data. Instead it holds one group constant (the control group) and compares all other groups to it. This means that the Dunnett C test can only be used in situations where there is a control group.

The generic formula for the Dunnett C test is presented below along with the findings of the test for Dana’s study.

\[ \text{Dunnett C}\\~\\ D_{Dunnett} = t_{Dunnett}\sqrt{\frac{2 \cdot MSE}{n}}\]

Dunnett C Follow-Up Comparison Results
Group 1 Group 2 Mean Difference (G.1 - G.2) 95% CI Lower Bound 95% CI Upper Bound
Low Dose Control *-5.6 - significant -10.41 -0.79
High Dose Control *-5.5 - significant -10.31 -0.69
1 Differences in the table marked by an asterisk are significant at p < .05.
2 The MSE error term = 21.24 and Dunnett critical value = 2.33

The outcomes of the Dunnett C test are presented in the table above. As one can see, the results appear to support the findings of Dana’s Tukey’s HSD test presented earlier. At this point, Dana has found additional support for the findings of her study.
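The confidence intervals in the Dunnett C table can be checked the same way (Dunnett critical value 2.33 and MSE 21.24 taken from the table note):

```python
import math

# Dunnett C half-width: t * sqrt(2 * MSE / n), with t = 2.33, MSE = 21.24,
# and n = 10 per group.
t, mse, n = 2.33, 21.24, 10
d = t * math.sqrt(2 * mse / n)
print(round(d, 2))  # 4.8, close to the table's half-widths of ~4.81

# e.g. Low Dose vs Control: -5.6 +/- d, close to the table's [-10.41, -0.79].
print(round(-5.6 - d, 2), round(-5.6 + d, 2))
```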

Note

The Dunnett C outcome presented above uses Dunnett’s original single-step test procedure, which differs from the conclusion presented in the Green and Salkind textbook. The textbook appears to use a slight variation of the Dunnett C test (possibly a ‘step-up’ or ‘step-down’ derivative) that results in a slightly larger confidence interval that overlaps zero for the High Dose and Placebo group difference. Don’t worry about understanding the differences in methods at this point; if interested, readers should review Dunnett and Tamhane (1992). Neither approach is wrong, per se. Instead, when presenting output, researchers should be diligent in clearly reporting the types of tests used so that readers are able to understand, and even replicate, the research themselves.

Effect Size

Eta squared (\(\eta^2\)) is an effect size metric that ranges from 0 to 1. A value of 0 indicates that there are no differences in the mean scores among groups, while a value of 1 indicates that there are differences between at least two of the means on the dependent variable and no differences on the dependent variable scores within each of the groups (i.e., perfect replication). In general, \(\eta^2\) is interpreted as the proportion of variance of the dependent variable that is related to the factor. By convention, .01, .06, and .14 are interpreted as small, medium, and large effect sizes, respectively.

\(\eta^2\) Equation

\[ \eta^2 = \frac{SS_{\text{Treatment}}}{SS_{\text{Total}}}\]

Step 1: Input All of the Data

\[ \eta^2 = \frac{205.4}{778.8} \]

\[ \eta^2 = 0.26 \]

The resulting \(\eta^2\) is considered to be large.
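As a quick check on the effect-size arithmetic (values from the walkthrough):

```python
# Effect size: eta squared = SS Treatment / SS Total.
ss_treatment, ss_total = 205.4, 778.8

eta_sq = ss_treatment / ss_total
print(round(eta_sq, 2))  # 0.26, a large effect by the .01/.06/.14 convention
```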

 

A work by Alex Aguilar

aaguilar@thechicagoschool.edu