Independent-Samples t Test


Research Scenario

Question

Bart is interested in determining if individuals talk more when they are nervous. To evaluate his hypothesis, he obtains 30 volunteers from undergraduate classes to participate in his experiment. He randomly assigns 15 students to each of two groups. Next, he schedules separate appointments for each student. When students in the first group (low-stress) arrive for their separate appointments, they are told that in 10 minutes they will be asked to describe how friendly they are by completing a true-false personality measure. In contrast, students in the second group (high-stress) are told that in 10 minutes they will be asked a set of questions by a panel of measurement specialists to determine how friendly they are. In both groups, the conversations between the students and the experimenter are recorded for 10 minutes. During this 10-minute period, the experimenter makes prescribed comments if no conversation occurs within a 1-minute time interval and responds to student questions with a brief response such as yes or no. The percent of time that each student talks during the 10-minute period is recorded.

A grouping variable (independent variable), labeled stress, distinguishes between the low-stress group (participants who believe they are going to complete a personality measure) and the high-stress group (participants who anticipate questions from a panel of measurement experts), while the test variable (dependent variable) is the percent of time that each student talks during the 10-minute period.

Instrument & Scoring

There is no defined instrument used; rather, the percent of time each student talks during the 10-minute period is recorded.

Null Hypothesis

Bart is interested in determining whether the average amount of talking differs under low-stress versus high-stress conditions. The null hypothesis for this test is that there is no difference in mean percent of talking time between the two groups. Additionally, since the research question is non-directional, a two-tailed p value is used.

Assumptions

  1. The test variable is normally distributed in each of the two populations (as defined by the grouping variable).
  2. The variances of the normally distributed test variable for populations are equal.
  3. The cases represent a random sample from the population, and the scores on the test variable are independent of each other.

Check

Of the three major assumptions for the independent-samples t test, only the first two can be evaluated using data alone. The third assumption is best addressed through proper research design and sampling practices; simply put, if this assumption is found to be violated in the data analysis phase, there is little the researcher can do to address the issue.

The Test Variable is Normally Distributed in Each of the Two Populations

To evaluate the first assumption of normality, develop a histogram of the observed scores. Furthermore, and if desired, conduct skewness and kurtosis analyses.

Low Stress Group

[Figure: histogram of percent talk time, low stress group]

Skewness: -0.33
Kurtosis: -1.33

High Stress Group

[Figure: histogram of percent talk time, high stress group]

Skewness: 1.22
Kurtosis: 0.56

Conclusion

Reviewing the histograms along with the skewness and kurtosis values presented above for both stress conditions indicates that the high stress distribution of scores may show a slight degree of positive skew because its skewness value (1.22) is greater than 1. This shouldn’t be an issue, as each group contains at least 15 cases, which in many situations is large enough to yield fairly accurate p values. Additionally, some researchers (Hair et al., 2010; Byrne, 2010) have suggested that skewness values within +/- 2 and kurtosis values within +/- 7 are acceptable for approximating a normal distribution.
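
The rule of thumb cited above can be written as a simple check. The following is a minimal Python sketch, not part of the original analysis; the function name `within_rule_of_thumb` and its default thresholds are illustrative assumptions that encode the +/- 2 and +/- 7 guideline, applied to the skewness and kurtosis values reported for each group.

```python
def within_rule_of_thumb(skewness, kurtosis, max_skew=2.0, max_kurtosis=7.0):
    """Return True when both values fall inside the cited guideline.

    The thresholds are the +/- 2 (skewness) and +/- 7 (kurtosis) rule of
    thumb mentioned in the text; the function itself is hypothetical.
    """
    return abs(skewness) <= max_skew and abs(kurtosis) <= max_kurtosis


# Skewness and kurtosis values reported above for each stress group.
reported = {
    "low stress": (-0.33, -1.33),
    "high stress": (1.22, 0.56),
}

results = {name: within_rule_of_thumb(s, k) for name, (s, k) in reported.items()}
for name, ok in results.items():
    print(name, ok)
```

Both groups fall inside the guideline, which agrees with the conclusion that the mild positive skew in the high stress group is tolerable.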

Variances of the Test Variable for Populations are Equal

To evaluate the equal variances assumption, a Levene’s Test for Equality of Variances should be conducted. If the result is nonsignificant (p > .05), equal variances can be assumed; if, however, the Levene’s test is significant (p < .05), then equal variances CANNOT be assumed, and a Welch’s t test using the Welch–Satterthwaite Equation for Estimated Degrees of Freedom must be conducted.

Conclusion

Conducting a Levene’s test yields an F value of 0.03 with a p value of 0.88, which is much larger than .05 and therefore considered nonsignificant. Since the Levene’s test was nonsignificant, equal variances may be assumed.

Note

The steps for computing the Levene’s Test are not included here because this is already a somewhat dense walkthrough; they will be addressed in the One-Way ANOVA walkthrough materials.

Also, while equal variances may be assumed in Bart’s research study, keep reading past the first t test to see how a Welch’s t test with Welch–Satterthwaite estimated degrees of freedom is computed when equal variances cannot be assumed. It is good practice to understand both approaches, especially since some researchers opt never to assume equal variances (it’s one less thing to worry about), and the Welch approach is useful, and often necessary, when group sizes are unequal.


Independent Samples t Test Overview

The Independent Samples t Test compares the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different.

This test is also known as:

  • Independent t Test
  • Independent Measures t Test
  • Independent Two-sample t Test
  • Student t Test
  • Two-Sample t Test
  • Uncorrelated Scores t Test
  • Unpaired t Test
  • Unrelated t Test

Equal Variances Assumed

Independent Samples t Test Overall Equation

\[ t = \frac{\overline{X}_{X} - \overline{X}_{Y}}{\sqrt{\left[\frac{\left(\Sigma X^2 - \frac{(\Sigma X)^2}{N_{X}}\right) + \left(\Sigma Y^2 - \frac{(\Sigma Y)^2}{N_{Y}}\right)}{(N_{X} + N_{Y}) - 2}\right] \cdot \left[\frac{1}{N_X} + \frac{1}{N_{Y}}\right]}} \]

Simplified Equation

\[ t = \frac{\text{Mean Talk Time for Low Stress} - \text{Mean Talk Time for High Stress}}{\sqrt{\left[\frac{\left(\text{Sum of Squared Talk Time for Low Stress} - \frac{(\text{Sum of Talk Time for Low Stress})^2}{N_{\text{Low Stress}}}\right) + \left(\text{Sum of Squared Talk Time for High Stress} - \frac{(\text{Sum of Talk Time for High Stress})^2}{N_{\text{High Stress}}}\right)}{(N_{\text{Low Stress}} + N_{\text{High Stress}}) - 2}\right] \cdot \left[\frac{1}{N_\text{Low Stress}} + \frac{1}{N_{\text{High Stress}}}\right]}} \]

Data Needed to Complete the Independent Samples t Test - Assuming Equal Variances

To conduct an independent samples t test, six pieces of information are required, which Bart will need to compute.

  • mean percent talk time for low stress (\(\overline{X}_{X}\)) = ?
  • sample size for low stress (\(N_{\text{Low Stress}}\)) = ?
  • sum of squared percent talk time for low stress (\(\Sigma X^2\)) = ?
  • mean percent talk time for high stress (\(\overline{X}_{Y}\)) = ?
  • sample size for high stress (\(N_{\text{High Stress}}\)) = ?
  • sum of squared percent talk time for high stress (\(\Sigma Y^2\)) = ?

Group         Sample Size   Mean Percent Talk   Standard Deviation Talk   Sum of Squared Percent Talk
Low Stress    15            45.20               24.97                     39374
High Stress   15            22.07               27.14                     17613

Note:
Standard deviation is included for each group, but it is only used when not assuming equal variances in percent talk time.

Using the table above, Bart has gathered the outstanding pieces of information needed to conduct the independent samples t test:

  • mean percent talk time for low stress (\(\overline{X}_{X}\)) = 45.2
  • sample size for low stress (\(N_{\text{Low Stress}}\)) = 15
  • sum of squared percent talk time for low stress (\(\Sigma X^2\)) = 39374
  • mean percent talk time for high stress (\(\overline{X}_{Y}\)) = 22.07
  • sample size for high stress (\(N_{\text{High Stress}}\)) = 15
  • sum of squared percent talk time for high stress (\(\Sigma Y^2\)) = 17613
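
The overall equation also needs each group's squared total, \((\Sigma X)^2\) and \((\Sigma Y)^2\), which are not listed directly. A minimal Python sketch of how they can be recovered from the table values is shown here; it assumes the recorded percentages are whole numbers, so each total is the mean times the sample size rounded to the nearest integer. The variable names are illustrative only.

```python
# Summary values taken from the descriptives table above.
n_low, mean_low, sum_sq_low = 15, 45.20, 39374
n_high, mean_high, sum_sq_high = 15, 22.07, 17613

# Recover each group's raw total. The means are rounded to two decimals,
# so the product is rounded back to a whole number (assumption: the
# recorded percentages are integers).
total_low = round(mean_low * n_low)      # sum of X
total_high = round(mean_high * n_high)   # sum of Y

# Squared totals, i.e. (sum of X)^2 and (sum of Y)^2.
total_low_squared = total_low ** 2
total_high_squared = total_high ** 2

# Sum of squared deviations per group: sum(X^2) - (sum X)^2 / N.
ss_low = sum_sq_low - total_low_squared / n_low
ss_high = sum_sq_high - total_high_squared / n_high

print(total_low_squared, total_high_squared)  # 459684 109561
print(round(ss_low + ss_high, 2))             # 19037.33
```

The two squared totals and the combined 19037.33 are the same figures that appear when the data are entered and the denominator is simplified in the steps that follow.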

Step 1: Input All of the Data

\[ t = \frac{45.2 - 22.07}{\sqrt{\left[\frac{\left(39374 - \frac{(459684)}{15}\right) + \left(17613 - \frac{(109561)}{15}\right)}{28}\right] \cdot \left[ 0.13\right]}} \]

Step 2: Simplify the Denominator

\[ t = \frac{23.13}{\sqrt{\left[\frac{19037.33}{28}\right] \cdot \left[ 0.13\right]}} \]

Step 3: Simplify the Numerator and Denominator

\[ t = \frac{23.13}{9.52} \]

Step 4: Round the Resulting t Value

\[ t = 2.43 \]
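
Steps 1 through 4 can be reproduced in a few lines. The following is a minimal Python sketch under the equal-variances assumption; the function name `pooled_t` is hypothetical, and the group totals 678 and 331 are derived values (mean times sample size, rounded), not figures listed in the table.

```python
import math


def pooled_t(mean_x, sum_x, sum_sq_x, n_x, mean_y, sum_y, sum_sq_y, n_y):
    """Independent-samples t statistic, equal variances assumed."""
    # Sum of squared deviations within each group.
    ss_x = sum_sq_x - sum_x ** 2 / n_x
    ss_y = sum_sq_y - sum_y ** 2 / n_y
    # Pooled variance uses N_x + N_y - 2 degrees of freedom.
    pooled_variance = (ss_x + ss_y) / (n_x + n_y - 2)
    # Standard error of the difference between the two means.
    standard_error = math.sqrt(pooled_variance * (1 / n_x + 1 / n_y))
    return (mean_x - mean_y) / standard_error


# Low stress values first, then high stress values.
t_value = pooled_t(45.2, 678, 39374, 15, 22.07, 331, 17613, 15)
print(round(t_value, 2))  # 2.43
```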

Step 5: Determine the Significance

To determine whether the computed t value is significant, and therefore whether to reject the null hypothesis, consult a table of critical t values. A critical value is the value that a test statistic (in this case the t statistic) must exceed in absolute value for the null hypothesis to be rejected. In this instance, using degrees of freedom equal to 28 (30 - 2) with a two-tailed alpha level of 0.05, the critical value for this t test is 2.048.
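
As an alternative to a printed table, the two-tailed p value can be approximated numerically. The sketch below is one possible approach rather than the method used in the original walkthrough: it integrates the Student t density with Simpson's rule using only the Python standard library. The function names `t_pdf` and `two_tailed_p` and the `steps` parameter are illustrative assumptions.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_value, df, steps=2000):
    """Approximate two-tailed p value; steps must be an even number."""
    upper = abs(t_value)
    h = upper / steps
    # Simpson's rule for the area under the density between 0 and |t|.
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    # Each tail holds 0.5 minus that area; double it for two tails.
    return 2 * (0.5 - area)


p_value = two_tailed_p(2.43, 28)
print(p_value < 0.05)  # True
```

Because the resulting p value is below .05, this check leads to the same decision as comparing 2.43 against the critical value of 2.048. The function also accepts non-integer degrees of freedom.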

Independent Samples t Test - Equal Variances Assumed Conclusion

The observed difference appears to be significant since the computed t value of 2.43 is larger than the critical t value of 2.048.

Effect Size

An effect size is a way of quantifying the size of the difference between two groups; or in this case, a measure of magnitude of difference in percentage of talk time between students in Bart’s high and low stress conditions. To evaluate effect size, use the Cohen’s d effect size.

Cohen’s d Overall Equation
\[ d = t{\sqrt{\frac{N_{X} + N_{Y}}{N_{X}\cdot N_{Y}}}} \]

Simplified Equation
\[ d = t{\sqrt{\frac{N_{\text{Low Stress}} + N_{\text{High Stress}}}{N_{\text{Low Stress}}\cdot N_{\text{High Stress}}}}} \]

Step 1: Input All of the Data
\[ d = 2.43\sqrt{\frac{30}{225}} \]
Step 2: Compute Cohen’s d
\[ d = 0.89 \]

The resulting Cohen’s d of 0.89 is considered to be large, since it exceeds Cohen’s conventional benchmark of 0.8 for a large effect.
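
The Cohen’s d computation can be sketched in Python as follows. The first function mirrors the equation above. The second is a cross-check that is not part of the original walkthrough: it uses the standard definition of d as the mean difference divided by the pooled standard deviation, which is algebraically equivalent for this test. Both function names are hypothetical, and 19037.33 is the combined sum of squared deviations from the t test denominator.

```python
import math


def cohens_d_from_t(t_value, n_x, n_y):
    """Cohen's d recovered from the t statistic and the group sizes."""
    return t_value * math.sqrt((n_x + n_y) / (n_x * n_y))


def cohens_d_from_means(mean_x, mean_y, pooled_sd):
    """Cohen's d as the mean difference in pooled standard deviation units."""
    return (mean_x - mean_y) / pooled_sd


d_from_t = cohens_d_from_t(2.43, 15, 15)

# Pooled standard deviation from the combined sum of squared deviations.
pooled_sd = math.sqrt(19037.33 / 28)
d_from_means = cohens_d_from_means(45.2, 22.07, pooled_sd)

print(round(d_from_t, 2), round(d_from_means, 2))  # 0.89 0.89
```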

Equal Variances Not Assumed

When equal variances cannot be assumed, a Welch’s t test must be computed, with its degrees of freedom approximated using the Welch–Satterthwaite Equation. The steps for conducting each computation are outlined below.

Data Needed to Complete the Independent Samples t Test - Not Assuming Equal Variances

To conduct an independent-samples Welch’s t test with degrees of freedom estimated by the Welch–Satterthwaite Equation, Bart needs five pieces of information for each group; four of the five were already presented above in the descriptives table used when assuming equal variances.

  • mean percent talk time for low stress (\(\overline{X}_{X}\)) = 45.2
  • sample size for low stress (\(N_{\text{Low Stress}}\)) = 15
  • sum of squared percent talk time for low stress (\(\Sigma X^2\)) = 39374
  • standard deviation percent talk time of low stress group (\(s_{X}\)) = 24.97

  • mean percent talk time for high stress (\(\overline{X}_{Y}\)) = 22.07
  • sample size for high stress (\(N_{\text{High Stress}}\)) = 15
  • sum of squared percent talk time for high stress (\(\Sigma Y^2\)) = 17613
  • standard deviation percent talk time of high stress group (\(s_{Y}\)) = 27.14

However, unlike the t test that assumes equal variances, Bart will also need the degrees of freedom associated with each group’s variance, computed as the group sample size less 1; since the group sizes are equal, the degrees of freedom associated with each group’s percent talk variance will be equal.

  • degrees of freedom associated with low stress percent talk variance (\(\nu_{X}\)) = 14
  • degrees of freedom associated with high stress percent talk variance (\(\nu_{Y}\)) = 14

Welch’s t Test Overall Equation

\[ t = \frac{\overline{X}_{X} - \overline{X}_{Y}}{\sqrt{\frac{s_{X}^{2}}{N_{X}} + \frac{s_{Y}^{2}}{N_{Y}}}} \]

Simplified Equation

\[ t = \frac{\text{Mean Talk Time for Low Stress} - \text{Mean Talk Time for High Stress}}{\sqrt{\frac{\text{Standard Deviation of Talk Time for Low Stress}^2}{N_{\text{Low Stress}}} + \frac{\text{Standard Deviation of Talk Time for High Stress}^2}{N_\text{High Stress}}}} \]

Step 1: Input All of the Data

\[ t = \frac{45.2 - 22.07}{\sqrt{\frac{623.46}{15} + \frac{736.35}{15}}} \]

Step 2: Simplify the Numerator and Denominator

\[ t = \frac{23.13}{9.52} \]

Step 3: Round the Resulting t Value

\[ t = 2.43 \]
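
The Welch computation can be sketched in Python directly from the means, standard deviations, and sample sizes. This is a minimal illustration with a hypothetical function name, `welch_t`; it uses the rounded standard deviations from the descriptives table, so intermediate values differ slightly from the 623.46 and 736.35 shown in Step 1.

```python
import math


def welch_t(mean_x, sd_x, n_x, mean_y, sd_y, n_y):
    """Welch's t statistic; each group's variance is kept separate."""
    standard_error = math.sqrt(sd_x ** 2 / n_x + sd_y ** 2 / n_y)
    return (mean_x - mean_y) / standard_error


# Low stress values first, then high stress values.
t_value = welch_t(45.2, 24.97, 15, 22.07, 27.14, 15)
print(round(t_value, 2))  # 2.43
```

The result matches the equal-variances t value because the two groups are the same size; with equal group sizes the pooled and Welch standard errors coincide, so only the degrees of freedom differ between the two approaches.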

Welch–Satterthwaite Equation for Estimated Degrees of Freedom Overall Equation

\[ \nu \approx \frac{\left(\frac{s_{X}^{2}}{N_{X}} + \frac{s_{Y}^{2}}{N_{Y}}\right)^{2}} {{\frac{s_{X}^{4}}{N_{X}^{2}\cdot\nu_{X}} + \frac{s_{Y}^{4}}{N_{Y}^{2}\cdot\nu_Y}}} \]

Simplified Equation

\[ \nu \approx \frac{\left(\frac{\text{Standard Deviation of Talk Time for Low Stress}^{2}}{N_{\text{Low Stress}}} + \frac{\text{Standard Deviation of Talk Time for High Stress}^{2}}{N_{\text{High Stress}}}\right)^{2}} {{\frac{\text{Standard Deviation of Talk Time for Low Stress}^{4}}{N_{\text{Low Stress}}^{2}\cdot\nu_{\text{Low Stress}}} + \frac{\text{Standard Deviation of Talk Time for High Stress}^{4}}{N_{\text{High Stress}}^{2}\cdot\nu_\text{High Stress}}}} \]

Step 1: Input All of the Data

\[ \nu \approx \frac{\left(\frac{623.46}{15} + \frac{736.35}{15}\right)^{2}} {{\frac{388698.81}{3150} + \frac{542214.83}{3150}}} \]

Step 2: Simplify the Numerator and Denominator

\[ \nu \approx \frac{8218.14}{295.53} \]

Step 3: Round the Resulting Estimated \(\nu\) Value
\[ \nu \approx 27.81 \]

The approximated degrees of freedom for the Welch t test is 27.81.
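
The Welch–Satterthwaite estimate can also be sketched in Python. The function name `welch_satterthwaite_df` is hypothetical, and the inputs are the rounded standard deviations from the descriptives table, so the intermediate values differ slightly from those shown in the steps above.

```python
def welch_satterthwaite_df(sd_x, n_x, sd_y, n_y):
    """Estimated degrees of freedom for Welch's t test."""
    # Variance of each group's mean: s^2 / N.
    term_x = sd_x ** 2 / n_x
    term_y = sd_y ** 2 / n_y
    numerator = (term_x + term_y) ** 2
    # Squaring each term gives s^4 / N^2, divided by that group's N - 1.
    denominator = term_x ** 2 / (n_x - 1) + term_y ** 2 / (n_y - 1)
    return numerator / denominator


df_estimate = welch_satterthwaite_df(24.97, 15, 27.14, 15)
print(round(df_estimate, 2))  # approximately 27.81
```

When the two sample variances are identical and the groups are the same size, the estimate reaches its maximum of N_X + N_Y - 2; here the variances are close, which is why the result sits just under 28.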

Independent Samples t Test - Equal Variances Not Assumed Conclusion

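The observed difference appears to be significant since the computed t value of 2.43 is larger than the two-tailed critical t value of approximately 2.05 for the estimated 27.81 degrees of freedom (essentially the same as the 2.048 used with 28 degrees of freedom).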

Effect Size

Cohen’s d Overall Equation
\[ d = t{\sqrt{\frac{N_{X} + N_{Y}}{N_{X}\cdot N_{Y}}}} \]

Simplified Equation
\[ d = t{\sqrt{\frac{N_{\text{Low Stress}} + N_{\text{High Stress}}}{N_{\text{Low Stress}}\cdot N_{\text{High Stress}}}}} \]

Step 1: Input All of the Data

\[ d = 2.43\sqrt{\frac{30}{225}} \]

Step 2: Compute Cohen’s d

\[ d = 0.89 \]

The resulting Cohen’s d is considered to be large.

 

A work by Alex Aguilar

aaguilar@thechicagoschool.edu