Safaa Dabagh

Lesson 3: Post-Hoc Tests

Identifying Which Groups Differ After a Significant ANOVA

Why Do We Need Post-Hoc Tests?

The Limitation of ANOVA

When ANOVA gives a significant result, we know:

"At least one population mean is different from the others."

But ANOVA does NOT tell us:

  • Which specific groups differ?
  • How many groups differ?
  • Which group has the highest/lowest mean?

Purpose of Post-Hoc Tests

Post-hoc tests (also called multiple comparison tests) perform pairwise comparisons to identify which specific groups differ significantly.

Important: We only conduct post-hoc tests AFTER finding a significant F-statistic in ANOVA!

Example Scenario

Suppose ANOVA comparing 4 teaching methods gives F = 8.2, p < 0.01 (significant).

We know: At least one method differs.

We DON'T know:

  • Is Method 1 different from Method 2?
  • Is Method 3 different from Method 4?
  • Which method is best?

Solution: Use post-hoc tests to answer these questions!

Tukey's HSD (Honestly Significant Difference)

Tukey's HSD is the most commonly used post-hoc test. It controls the family-wise error rate while comparing all possible pairs of groups.
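
To see why the family-wise error rate needs controlling, the short sketch below estimates the chance of at least one false positive when several tests are each run at α = 0.05. It assumes the tests are independent, which pairwise comparisons that share groups are not, so the figures are illustrative rather than exact. The function name is made up for this example.

```python
def familywise_error_rate(alpha, num_tests):
    """Chance of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** num_tests


# k groups give k(k-1)/2 pairwise comparisons.
for k in (3, 4, 5):
    c = k * (k - 1) // 2
    fwer = familywise_error_rate(0.05, c)
    print(f"k = {k}: {c} comparisons, FWER = {fwer:.3f}")
```

Under this simplifying assumption, three uncorrected comparisons already carry roughly a 14% chance of a false positive, and six carry roughly 26%.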

The Tukey HSD Formula

HSD = q × √(MSW / n)

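Where:

  • q = critical value from the Studentized Range distribution
  • MSW = Mean Square Within (from the ANOVA table)
  • n = number of observations per group (equal group sizes)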

Decision Rule

For any two groups i and j:

If |x̄ᵢ - x̄ⱼ| > HSD → Groups i and j differ significantly
If |x̄ᵢ - x̄ⱼ| ≤ HSD → No significant difference
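
The formula and decision rule can be sketched as two small functions. This is a minimal illustration for equal group sizes. The function names and the input values (q = 3.5, MSW = 8.0, n = 8) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def tukey_hsd(q, msw, n):
    """HSD = q * sqrt(MSW / n), for equal group sizes n."""
    return q * math.sqrt(msw / n)


def differs_significantly(mean_i, mean_j, hsd):
    """Decision rule: significant only if |mean_i - mean_j| > HSD."""
    return abs(mean_i - mean_j) > hsd


# Hypothetical inputs, not taken from a real data set.
hsd = tukey_hsd(q=3.5, msw=8.0, n=8)        # 3.5 * sqrt(1.0) = 3.5
print(differs_significantly(70, 75, hsd))   # difference of 5 exceeds 3.5
print(differs_significantly(70, 72, hsd))   # difference of 2 does not
```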

Finding the q-value

The q-value depends on:

  • k = number of groups
  • df = degrees of freedom within groups (N - k)
  • α = significance level

You look up q in a Studentized Range table or use statistical software.

Example: q-Value for k = 3, df = 12, α = 0.05

From Studentized Range table: q = 3.77

(This value is rounded; full Studentized Range tables are available in textbooks or online.)

Complete Example: Tukey's HSD

Continuing from Lesson 2 Example

Recall our three study methods with:

  • Method 1 (Traditional): x̄₁ = 80
  • Method 2 (Flashcards): x̄₂ = 86
  • Method 3 (Practice Tests): x̄₃ = 92

ANOVA results: F = 31.75, p < 0.001 (significant)

From ANOVA: MSW = 5.67, n = 5 per group, k = 3, df_within = 12

Step 1: Find Critical q-Value

From Studentized Range table with k = 3, df = 12, α = 0.05:

q = 3.77

Step 2: Calculate HSD

HSD = q × √(MSW / n)
HSD = 3.77 × √(5.67 / 5)
HSD = 3.77 × √1.134
HSD = 3.77 × 1.065 = 4.01

Step 3: Compare All Pairs

Comparison 1: Method 1 vs Method 2

|x̄₁ - x̄₂| = |80 - 86| = 6
6 > 4.01 → Significant difference

Comparison 2: Method 1 vs Method 3

|x̄₁ - x̄₃| = |80 - 92| = 12
12 > 4.01 → Significant difference

Comparison 3: Method 2 vs Method 3

|x̄₂ - x̄₃| = |86 - 92| = 6
6 > 4.01 → Significant difference

Step 4: Summarize Results

Comparison    | Difference in Means | HSD = 4.01 | Conclusion
Method 1 vs 2 | 6                   | 6 > 4.01   | Significant
Method 1 vs 3 | 12                  | 12 > 4.01  | Significant
Method 2 vs 3 | 6                   | 6 > 4.01   | Significant

Conclusion: All three study methods produce significantly different mean exam scores. Method 3 (Practice Tests) is superior, followed by Method 2 (Flashcards), then Method 1 (Traditional).
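
The worked example can be reproduced in a few lines of Python. This sketch uses only the values quoted in the example (group means, q = 3.77, MSW = 5.67, n = 5). The variable names are illustrative.

```python
import math
from itertools import combinations

# Values quoted in the worked example.
means = {"Method 1": 80, "Method 2": 86, "Method 3": 92}
q, msw, n = 3.77, 5.67, 5

hsd = q * math.sqrt(msw / n)
print(f"HSD = {hsd:.2f}")

# Compare every pair of group means against the single HSD threshold.
results = {}
for (name_a, mean_a), (name_b, mean_b) in combinations(means.items(), 2):
    diff = abs(mean_a - mean_b)
    results[(name_a, name_b)] = diff > hsd
    verdict = "significant" if diff > hsd else "not significant"
    print(f"{name_a} vs {name_b}: |difference| = {diff}, {verdict}")
```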

Bonferroni Correction

The Bonferroni correction is another method for controlling Type I error in multiple comparisons. It's more conservative (stricter) than Tukey's HSD.

The Bonferroni Method

Instead of calculating a single HSD value, Bonferroni adjusts the significance level (α) for each comparison:

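α_adjusted = α / c

Where:

  • α = original significance level
  • c = number of pairwise comparisons = k(k - 1) / 2 for k groups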

Then: Perform each pairwise comparison using a two-sample t-test with the adjusted α.

Example: Bonferroni with k = 4 Groups

Number of comparisons: c = 4(3)/2 = 6

Original α = 0.05

α_adjusted = 0.05 / 6 ≈ 0.0083

For each of the 6 pairwise comparisons, we would use α = 0.0083 instead of 0.05.

This keeps the overall family-wise error rate at or below 0.05.
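
The adjustment can be sketched as follows. The function names are made up for this example, and the six p-values are hypothetical, included only to show how each comparison is judged against the adjusted α.

```python
def num_pairwise_comparisons(k):
    """Number of distinct pairs among k groups: k(k-1)/2."""
    return k * (k - 1) // 2


def bonferroni_alpha(alpha, num_comparisons):
    """Per-comparison significance level under Bonferroni."""
    return alpha / num_comparisons


c = num_pairwise_comparisons(4)
alpha_adj = bonferroni_alpha(0.05, c)
print(c, round(alpha_adj, 4))

# Hypothetical p-values from six pairwise t-tests, for illustration only.
p_values = [0.001, 0.004, 0.012, 0.030, 0.200, 0.650]
significant = [p < alpha_adj for p in p_values]
print(significant)
```

Note that p = 0.012 and p = 0.030 would pass an uncorrected 0.05 threshold but do not pass the adjusted one, which is what "more conservative" means in practice.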

Tukey vs Bonferroni: When to Use Which?

Method: Tukey's HSD

Strengths:

  • Most powerful when comparing all pairs
  • Easy to calculate and interpret
  • Controls family-wise error rate

Best used when:

  • Sample sizes are equal
  • You want to compare ALL pairs
  • You want the standard choice for ANOVA

Method: Bonferroni

Strengths:

  • Very simple to understand
  • Works with unequal sample sizes
  • Can be used for specific comparisons

Best used when:

  • Sample sizes are unequal
  • You are only interested in a few specific comparisons
  • You want to be extra conservative

General Rule

Use Tukey's HSD as your default post-hoc test for ANOVA. It's the most commonly used and strikes a good balance between power and error control.

Use Bonferroni when you have unequal sample sizes or only want to test specific comparisons (not all pairs).

Other Post-Hoc Tests (Brief Overview)

Scheffé's Test

Dunnett's Test

When to Use Dunnett's

Testing 3 new drugs (A, B, C) against a placebo:

Dunnett's tests:

  • Drug A vs Placebo
  • Drug B vs Placebo
  • Drug C vs Placebo

Does NOT test: Drug A vs Drug B, etc.
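
The difference in the set of comparisons can be made concrete with a short sketch. It only enumerates which pairs each approach looks at. It does not perform Dunnett's test itself, which needs its own critical values.

```python
from itertools import combinations

control = "Placebo"
treatments = ["Drug A", "Drug B", "Drug C"]

# Dunnett-style family: each treatment against the control only.
dunnett_pairs = [(t, control) for t in treatments]

# All-pairs family (what Tukey's HSD would cover), for contrast.
all_pairs = list(combinations(treatments + [control], 2))

print(len(dunnett_pairs), "comparisons vs control:", dunnett_pairs)
print(len(all_pairs), "comparisons among all pairs")
```

With four groups, the control-only family has 3 comparisons instead of 6.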

Important Guidelines for Post-Hoc Testing

When NOT to Do Post-Hoc Tests

  1. If ANOVA is not significant: If the F-test fails to reject H₀, STOP. Don't do post-hoc tests.
  2. With only 2 groups: If k = 2, ANOVA is equivalent to a t-test. No post-hoc needed.
  3. Before running ANOVA: Always do the overall F-test first!

The Proper Sequence

  1. Run ANOVA to test if any differences exist
  2. If ANOVA is significant: Proceed to post-hoc tests
  3. Choose appropriate post-hoc test (usually Tukey's HSD)
  4. Identify which specific pairs differ
  5. Report findings with context and interpretation
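
The sequence above can be sketched as a single function that refuses to run pairwise comparisons unless the ANOVA was significant. This is a minimal sketch assuming equal group sizes and Tukey's HSD. The function name, the group labels, and all numeric inputs are hypothetical, and the caller is assumed to supply a q-value that matches the chosen α.

```python
import math
from itertools import combinations


def post_hoc_after_anova(anova_p, means, q, msw, n, alpha=0.05):
    """Return {pair: differs?} via Tukey's HSD, or None if ANOVA is not significant."""
    if anova_p >= alpha:
        return None  # ANOVA not significant: stop, no post-hoc tests
    hsd = q * math.sqrt(msw / n)
    return {
        (a, b): abs(means[a] - means[b]) > hsd
        for a, b in combinations(means, 2)
    }


# Hypothetical group means and ANOVA summary values.
means = {"A": 50, "B": 54, "C": 60, "D": 52}

skipped = post_hoc_after_anova(0.15, means, q=3.96, msw=12, n=8)
print(skipped)  # ANOVA not significant, so nothing is compared

pairs = post_hoc_after_anova(0.01, means, q=3.96, msw=12, n=8)
for pair, differs in pairs.items():
    print(pair, "significant" if differs else "not significant")
```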

Check Your Understanding

Question 1

An ANOVA comparing 4 fertilizers gives F = 2.1, p = 0.15. Should you conduct post-hoc tests? Why or why not?

Answer: NO, do not conduct post-hoc tests.

Reason: The ANOVA is NOT significant (p = 0.15 > 0.05). We failed to reject H₀, meaning we don't have evidence that any of the fertilizers differ. Post-hoc tests are only conducted AFTER finding a significant F-statistic.

Question 2

Given: k = 4 groups, MSW = 12, n = 8 per group, q = 3.96. Calculate Tukey's HSD.

Step 1: Use the formula HSD = q × √(MSW / n)

HSD = 3.96 × √(12 / 8)
HSD = 3.96 × √1.5
HSD = 3.96 × 1.225 = 4.85

Answer: HSD = 4.85

Any two group means that differ by more than 4.85 are significantly different.

Question 3

Using the HSD from Question 2, determine if groups with x̄₁ = 50 and x̄₂ = 54 differ significantly.

Step 1: Calculate the difference

|x̄₁ - x̄₂| = |50 - 54| = 4

Step 2: Compare to HSD = 4.85

4 < 4.85

Answer: No significant difference.

The difference (4) is less than HSD (4.85), so we conclude that groups 1 and 2 do not differ significantly.

Question 4

If you're comparing 5 groups, how many pairwise comparisons are there? What would the Bonferroni-adjusted α be if the original α = 0.05?

Step 1: Calculate number of comparisons

c = k(k-1) / 2 = 5(4) / 2 = 10 comparisons

Step 2: Calculate adjusted α

α_adjusted = 0.05 / 10 = 0.005

Answers:

  • 10 pairwise comparisons
  • Bonferroni α = 0.005 (or 0.5%) for each test

Lesson Summary