Two-Sample Tests for Means (Independent Samples)

Learn how to compare means from two independent populations

Lesson Objectives

By the end of this lesson, you will be able to:

Distinguish between independent and dependent samples
State the conditions required for a two-sample t-test
Formulate null and alternative hypotheses for comparing two means
Calculate test statistics for unpooled (Welch's) and pooled variance approaches
Conduct and interpret two-sample t-tests for independent samples

1. Independent vs. Dependent Samples

Key Definitions

Independent Samples: Two samples are independent if the observations in one sample are completely unrelated to the observations in the other sample. The samples are drawn from two separate populations.

Dependent (Paired) Samples: Two samples are dependent if each observation in one sample is naturally paired or matched with an observation in the other sample.

Examples of Independent Samples

Comparing average heights of randomly selected men vs. randomly selected women
Comparing test scores from students at School A vs. students at School B
Comparing recovery times for patients receiving Treatment A vs. patients receiving Treatment B (different patients in each group)
Comparing salaries in California vs. Texas

Examples of Dependent Samples

Comparing blood pressure before and after medication in the same patients
Comparing test scores of twins (one twin in each group)
Comparing pre-test and post-test scores for the same students
Comparing left eye vs. right eye vision for the same individuals

Key Question: Are the same subjects (or matched pairs) measured twice, or are there two completely separate groups? Same subjects/matched pairs = dependent. Separate groups = independent.

2. The Two-Sample t-Test for Independent Samples

When we want to compare the means of two independent populations, we use a two-sample t-test (also called an independent samples t-test).

Hypotheses

The hypotheses compare the two population means μ₁ and μ₂:

Test Type	Null Hypothesis (H₀)	Alternative Hypothesis (Hₐ)
Two-tailed	H₀: μ₁ = μ₂ or H₀: μ₁ - μ₂ = 0	Hₐ: μ₁ ≠ μ₂ or Hₐ: μ₁ - μ₂ ≠ 0
Right-tailed	H₀: μ₁ = μ₂	Hₐ: μ₁ > μ₂ or Hₐ: μ₁ - μ₂ > 0
Left-tailed	H₀: μ₁ = μ₂	Hₐ: μ₁ < μ₂ or Hₐ: μ₁ - μ₂ < 0

Conditions for Two-Sample t-Test

Before conducting the test, verify these conditions:

Independence:
- The two samples are independent of each other
- Observations within each sample are independent (random sampling)
- Each sample size is less than 10% of its population (if sampling without replacement)
Normality:
- Both populations are normally distributed, OR
- Both sample sizes are large (n₁ ≥ 30 AND n₂ ≥ 30)
- If sample sizes are small, check for major skewness or outliers

Important: If the samples are NOT independent (e.g., before/after measurements, matched pairs), you CANNOT use a two-sample t-test. You must use a paired t-test instead (Lesson 2).

3. Test Statistic: Unpooled Variance (Welch's t-test)

The most common approach is the unpooled variance method, also called Welch's t-test. This method does NOT assume that the two populations have equal variances.

Test Statistic (Unpooled Variance)

t = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂)

where:
x̄₁, x̄₂ = sample means
s₁, s₂ = sample standard deviations
n₁, n₂ = sample sizes

Degrees of Freedom (Welch's Approximation)

The degrees of freedom calculation is complex:

df = [(s₁²/n₁ + s₂²/n₂)²] / [(s₁²/n₁)²/(n₁-1) + (s₂²/n₂)²/(n₂-1)]

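In practice, use technology to calculate df. Most calculators and software apply this formula automatically when the unpooled (unequal variances) option is selected. If you look up values in a t-table by hand, round df down to the nearest whole number, which is the conservative choice.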

Example 1: Unpooled Two-Sample t-Test

Research Question: Does a new drug lower cholesterol more than the standard drug?

Data:

New Drug (Group 1): n₁ = 35, x̄₁ = 195 mg/dL, s₁ = 18 mg/dL
Standard Drug (Group 2): n₂ = 40, x̄₂ = 210 mg/dL, s₂ = 22 mg/dL

Significance level: α = 0.05

Step 1: State hypotheses

H₀: μ₁ = μ₂ (new drug is not more effective)
Hₐ: μ₁ < μ₂ (new drug lowers cholesterol more) — left-tailed test

Step 2: Check conditions

Independent random samples from two groups
Both n₁ ≥ 30 and n₂ ≥ 30 (large samples, so the Central Limit Theorem applies)

Step 3: Calculate test statistic

t = (195 - 210) / √(18²/35 + 22²/40)
t = -15 / √(9.257 + 12.1)
t = -15 / √21.357
t = -15 / 4.621
t ≈ -3.25

Step 4: Find degrees of freedom

Using Welch's formula (or technology):

df = (9.257 + 12.1)² / (9.257²/34 + 12.1²/39)
df = 456.13 / (2.520 + 3.754)
df ≈ 72.7 (round down to 72)

Step 5: Find p-value

For a left-tailed test with t = -3.25 and df = 72:

p-value ≈ 0.001 (from t-table or technology)

Step 6: Make decision

Since p-value (0.001) < α (0.05), we reject H₀.

Step 7: Conclusion

There is sufficient evidence at the 0.05 significance level to conclude that mean cholesterol is lower in the new drug group than in the standard drug group, which supports the claim that the new drug lowers cholesterol more than the standard drug.
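
The arithmetic in Example 1 can be reproduced in a few lines of Python. The sketch below is illustrative only: the function name welch_t_from_stats is hypothetical (it does not come from any library), and it works from summary statistics rather than raw data, applying the unpooled test statistic and the Welch degrees-of-freedom formula from Section 3.

```python
import math

def welch_t_from_stats(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for the unpooled (Welch) two-sample t-test.

    Hypothetical helper for illustration; inputs are summary statistics.
    """
    v1 = sd1 ** 2 / n1   # s1^2 / n1
    v2 = sd2 ** 2 / n2   # s2^2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch's approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Example 1 summary statistics (new drug = group 1, standard drug = group 2)
t, df = welch_t_from_stats(195, 18, 35, 210, 22, 40)
print(f"t = {t:.2f}")            # t = -3.25
print(f"Welch df = {df:.1f}")
```

A quick sanity check on the formula: when both groups have the same size and the same standard deviation, the Welch degrees of freedom equal n₁ + n₂ - 2, the same value the pooled approach uses.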

4. Test Statistic: Pooled Variance Approach

If you have reason to believe that the two populations have equal variances (σ₁² = σ₂²), you can use the pooled variance approach. This combines the sample variances into a single estimate.

Pooled Variance

sp² = [(n₁ - 1)s₁² + (n₂ - 1)s₂²] / (n₁ + n₂ - 2)

where:
sp² = pooled variance estimate
sp = √(sp²) = pooled standard deviation

Test Statistic (Pooled Variance)

t = (x̄₁ - x̄₂) / (sp√(1/n₁ + 1/n₂))

Degrees of freedom: df = n₁ + n₂ - 2

When to Use Pooled vs. Unpooled?

Pooled: Use when you have strong evidence that σ₁² = σ₂² (e.g., experimental design ensures equal variances)
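Unpooled (Welch's): Use when variances may be unequal OR when you're unsure. This is the safer choice because it remains valid whether or not the population variances are equal. It is the default in some statistical software (for example, R's t.test function) but not in all, so check which version your tool reports.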

Example 2: Pooled Two-Sample t-Test

Research Question: Do students learn better with Method A or Method B?

Data:

Method A: n₁ = 25, x̄₁ = 82, s₁ = 8
Method B: n₂ = 30, x̄₂ = 78, s₂ = 7
Assume equal population variances (controlled experimental design)

Significance level: α = 0.05, two-tailed test

Step 1: Hypotheses

H₀: μ₁ = μ₂ (no difference in learning)
Hₐ: μ₁ ≠ μ₂ (there is a difference)

Step 2: Calculate pooled variance

sp² = [(25-1)(8²) + (30-1)(7²)] / (25 + 30 - 2)
sp² = [24(64) + 29(49)] / 53
sp² = [1536 + 1421] / 53
sp² = 2957 / 53 ≈ 55.79
sp ≈ 7.47

Step 3: Calculate test statistic

t = (82 - 78) / (7.47√(1/25 + 1/30))
t = 4 / (7.47√0.0733)
t = 4 / (7.47 × 0.2707)
t = 4 / 2.022
t ≈ 1.98

Step 4: Degrees of freedom

df = 25 + 30 - 2 = 53

Step 5: Critical value and decision

For α = 0.05 (two-tailed) and df = 53: t-critical ≈ ±2.006

Since |1.98| < 2.006, we fail to reject H₀.

Step 6: Conclusion

There is insufficient evidence at the 0.05 significance level to conclude that the two teaching methods produce different average test scores.
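
Example 2's pooled calculation can be sketched in Python as well. The helper name pooled_t_from_stats is hypothetical, and the critical value 2.006 is copied from Step 5 rather than computed, so treat this as a check on the hand arithmetic and not as a full implementation of the test.

```python
import math

def pooled_t_from_stats(mean1, sd1, n1, mean2, sd2, n2):
    """Return (sp2, t, df) for the pooled-variance two-sample t-test.

    Hypothetical helper for illustration; assumes equal population variances.
    """
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    t = (mean1 - mean2) / (math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2))
    return sp2, t, df

# Example 2 summary statistics (Method A = group 1, Method B = group 2)
sp2, t, df = pooled_t_from_stats(82, 8, 25, 78, 7, 30)
print(f"sp^2 = {sp2:.2f}, t = {t:.2f}, df = {df}")  # sp^2 = 55.79, t = 1.98, df = 53

t_critical = 2.006  # two-tailed, alpha = 0.05, df = 53 (taken from Step 5)
print("reject H0" if abs(t) > t_critical else "fail to reject H0")  # fail to reject H0
```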

5. Decision Methods: Critical Value vs. p-Value

Critical Value Approach

Calculate the test statistic t
Find the critical value(s) from the t-table based on α and df
Compare:
- Two-tailed: Reject H₀ if |t| > t-critical
- Right-tailed: Reject H₀ if t > t-critical
- Left-tailed: Reject H₀ if t < -t-critical

p-Value Approach

Calculate the test statistic t
Find the p-value using technology (or approximate using t-table)
Compare: Reject H₀ if p-value < α

Both methods always lead to the same decision: the test statistic falls in the rejection region exactly when the p-value is less than α. The p-value approach is more informative because it also shows how strong the evidence against H₀ is.
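
To see that the two approaches agree, the sketch below computes both a two-tailed p-value and a two-tailed critical value for Example 2 (t ≈ 1.98, df = 53, α = 0.05) using only the Python standard library. It is a rough numerical illustration, not a replacement for statistical software: the tail area of the t-distribution is approximated with Simpson's rule, the critical value is found by bisection, and all function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t, df, steps=2000):
    """Approximate P(T > |t|) as 0.5 minus the Simpson's-rule area from 0 to |t|."""
    h = abs(t) / steps  # steps must be even for Simpson's rule
    area = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - area * h / 3

def two_tailed_critical(alpha, df):
    """Find t* with P(|T| > t*) = alpha by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 2 * upper_tail(mid, df) > alpha:
            lo = mid   # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

t_stat, df, alpha = 1.98, 53, 0.05        # values from Example 2
p_value = 2 * upper_tail(t_stat, df)      # two-tailed p-value
t_crit = two_tailed_critical(alpha, df)

reject_by_p = p_value < alpha
reject_by_critical = abs(t_stat) > t_crit
print(f"p-value = {p_value:.3f}, critical value = {t_crit:.3f}")
print(reject_by_p, reject_by_critical)    # False False
```

For one-tailed tests the same tail function applies without the factor of 2, provided the test statistic points in the direction of the alternative hypothesis.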

Check Your Understanding

Question 1: A researcher wants to compare average daily screen time for teenagers vs. adults. She randomly surveys 50 teenagers and 60 adults. Is this an independent or dependent samples design?

Answer: Independent samples. The teenagers and adults are two separate, unrelated groups. Each person is only in one group.

Question 2: Two samples have n₁ = 40, s₁ = 12 and n₂ = 35, s₂ = 15. Would you use pooled or unpooled variance?

Answer: Unpooled (Welch's t-test). The sample standard deviations are fairly different (12 vs 15), and there's no indication that the population variances are equal. Unpooled is the safer, more conservative choice.

Question 3: In a two-sample t-test, you get t = 2.8 with df = 45. For a two-tailed test at α = 0.05, the critical value is ±2.014. What is your decision?

Answer: Reject H₀. Since |2.8| = 2.8 > 2.014, the test statistic falls in the rejection region. There is sufficient evidence to conclude the two population means are different.

Key Takeaways

Two-sample t-tests compare means from two independent populations
Check independence: different groups, not paired or matched
Unpooled (Welch's) t-test is the default and safer choice
Pooled t-test requires equal population variances assumption
Hypotheses: H₀: μ₁ = μ₂ vs. Hₐ: μ₁ ≠ μ₂ (or <, >)
Always check conditions before conducting the test
Interpret results in the context of the research question
