Lesson 3: Post-Hoc Tests

Identifying Which Groups Differ After a Significant ANOVA

Why Do We Need Post-Hoc Tests?

The Limitation of ANOVA

When ANOVA gives a significant result, we know:

"At least one population mean is different from the others."

But ANOVA does NOT tell us:

Which specific groups differ?
How many groups differ?
Which group has the highest/lowest mean?

Purpose of Post-Hoc Tests

Post-hoc tests (also called multiple comparison tests) perform pairwise comparisons to identify which specific groups differ significantly.

Important: We only conduct post-hoc tests AFTER finding a significant F-statistic in ANOVA!

Example Scenario

Suppose ANOVA comparing 4 teaching methods gives F = 8.2, p < 0.01 (significant).

We know: At least one method differs.

We DON'T know:

Is Method 1 different from Method 2?
Is Method 3 different from Method 4?
Which method is best?

Solution: Use post-hoc tests to answer these questions!

Tukey's HSD (Honestly Significant Difference)

Tukey's HSD is the most commonly used post-hoc test. It controls the family-wise error rate while comparing all possible pairs of groups.

The Tukey HSD Formula

HSD = q × √(MSW / n)

Where:

q = critical value from the Studentized Range Distribution (q-table)
MSW = Mean Square Within from ANOVA table
n = sample size per group (assumes equal sample sizes)

Decision Rule

For any two groups i and j:

If |x̄_i - x̄_j| > HSD → Groups i and j differ significantly

If |x̄_i - x̄_j| ≤ HSD → No significant difference

Finding the q-value

The q-value depends on:

k = number of groups
df_within = N - k (from ANOVA)
α = significance level (usually 0.05)

You look up q in a Studentized Range table or use statistical software.

Example: Tukey HSD for k = 3, df = 12, α = 0.05

From Studentized Range table: q = 3.77

(This value is rounded to two decimal places; full tables are available in textbooks or online.)
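
If statistical software is available, the table lookup can be reproduced in a few lines. The sketch below assumes SciPy 1.7 or later is installed, where `scipy.stats.studentized_range` takes the number of groups and the within-groups degrees of freedom as shape parameters. The variable names are illustrative.

```python
from scipy.stats import studentized_range

k = 3            # number of groups
df_within = 12   # N - k, from the ANOVA table
alpha = 0.05     # significance level

# The critical value is the (1 - alpha) quantile of the
# Studentized Range distribution for k groups and df_within df.
q_crit = studentized_range.ppf(1 - alpha, k, df_within)
print(round(q_crit, 2))
```

The printed value should agree with the table entry of about 3.77 used in this lesson.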

Complete Example: Tukey's HSD

Continuing from Lesson 2 Example

Recall our three study methods with:

Method 1 (Traditional): x̄₁ = 80
Method 2 (Flashcards): x̄₂ = 86
Method 3 (Practice Tests): x̄₃ = 92

ANOVA results: F = 31.75, p < 0.001 (significant)

From ANOVA: MSW = 5.67, n = 5 per group, k = 3, df_within = 12

Step 1: Find Critical q-Value

From Studentized Range table with k = 3, df = 12, α = 0.05:

q = 3.77

Step 2: Calculate HSD

HSD = q × √(MSW / n)

HSD = 3.77 × √(5.67 / 5)

HSD = 3.77 × √1.134

HSD = 3.77 × 1.065 = 4.01

Step 3: Compare All Pairs

Comparison 1: Method 1 vs Method 2

|x̄₁ - x̄₂| = |80 - 86| = 6

6 > 4.01 → Significant difference

Comparison 2: Method 1 vs Method 3

|x̄₁ - x̄₃| = |80 - 92| = 12

12 > 4.01 → Significant difference

Comparison 3: Method 2 vs Method 3

|x̄₂ - x̄₃| = |86 - 92| = 6

6 > 4.01 → Significant difference

Step 4: Summarize Results

Comparison	Difference in Means	Compared with HSD = 4.01	Conclusion
Method 1 vs 2	6	6 > 4.01	Significant
Method 1 vs 3	12	12 > 4.01	Significant
Method 2 vs 3	6	6 > 4.01	Significant

Conclusion: All three study methods produce significantly different mean exam scores. Method 3 (Practice Tests) is superior, followed by Method 2 (Flashcards), then Method 1 (Traditional).
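
The four steps of this example can be checked with a short script. This is a minimal sketch using only the Python standard library and the numbers given above. The critical value q is typed in from the table rather than computed, and the dictionary layout is just one convenient choice.

```python
from itertools import combinations
from math import sqrt

# Values taken from the worked example above.
q = 3.77      # critical value for k = 3, df = 12, alpha = 0.05
msw = 5.67    # Mean Square Within from the ANOVA table
n = 5         # observations per group (equal group sizes)
means = {"Method 1": 80, "Method 2": 86, "Method 3": 92}

# Tukey's HSD: the smallest mean difference that counts as significant.
hsd = q * sqrt(msw / n)

# Compare every pair of group means against the HSD threshold.
results = {}
for (name_a, mean_a), (name_b, mean_b) in combinations(means.items(), 2):
    diff = abs(mean_a - mean_b)
    results[(name_a, name_b)] = (diff, diff > hsd)

print(f"HSD = {hsd:.2f}")
for (name_a, name_b), (diff, significant) in results.items():
    verdict = "significant" if significant else "not significant"
    print(f"{name_a} vs {name_b}: |diff| = {diff}, {verdict}")
```

Because the script loops over `combinations(means.items(), 2)`, adding a fourth group to `means` would automatically produce all six comparisons, provided q is updated for the new k and df.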

Bonferroni Correction

The Bonferroni correction is another method for controlling Type I error in multiple comparisons. When all pairs are compared, it is more conservative (stricter) than Tukey's HSD, so it is less likely to flag a real difference as significant.

The Bonferroni Method

Instead of calculating a single HSD value, Bonferroni adjusts the significance level (α) for each comparison:

α_adjusted = α / c

Where:

α = original significance level (e.g., 0.05)
c = number of pairwise comparisons = k(k-1)/2

Then: Perform each pairwise comparison using a two-sample t-test with the adjusted α.

Example: Bonferroni with k = 4 Groups

Number of comparisons: c = 4(3)/2 = 6

Original α = 0.05

α_adjusted = 0.05 / 6 = 0.0083

For each of the 6 pairwise comparisons, we would use α = 0.0083 instead of 0.05.

This keeps the overall family-wise error rate at or below 0.05.
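
The arithmetic above is easy to script. The sketch below uses a hypothetical helper, `bonferroni_alpha`, that is not part of any library. The last few lines illustrate why the adjustment matters, under the simplifying assumption that the tests are independent. Pairwise comparisons that share groups are not strictly independent, so treat those two figures as a rough guide only.

```python
from math import comb

def bonferroni_alpha(alpha, k):
    """Return (number of pairwise comparisons, adjusted alpha) for k groups."""
    c = comb(k, 2)          # same as k * (k - 1) / 2
    return c, alpha / c

c, alpha_adj = bonferroni_alpha(0.05, 4)
print(c, round(alpha_adj, 4))   # 6 0.0083

# Illustration only: if the c tests were independent, the chance of at
# least one false positive would be 1 - (1 - per_test_alpha) ** c.
fwer_unadjusted = 1 - (1 - 0.05) ** c
fwer_adjusted = 1 - (1 - alpha_adj) ** c
print(round(fwer_unadjusted, 3), round(fwer_adjusted, 3))
```

Under that independence assumption, six unadjusted tests at α = 0.05 would give roughly a 26% chance of at least one false positive, while the adjusted tests stay just under 5%.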

Tukey vs Bonferroni: When to Use Which?

Method	Strengths	Best Used When
Tukey's HSD	• Most powerful when comparing all pairs • Easy to calculate and interpret • Controls family-wise error rate	• Equal sample sizes • Want to compare ALL pairs • Standard choice for ANOVA
Bonferroni	• Very simple to understand • Works with unequal sample sizes • Can use for specific comparisons	• Unequal sample sizes • Only interested in few specific comparisons • Want to be extra conservative

General Rule

Use Tukey's HSD as your default post-hoc test for ANOVA. It's the most commonly used and strikes a good balance between power and error control.

Use Bonferroni when you only want to test a few specific comparisons (not all pairs), or when sample sizes are unequal and the equal-n HSD formula in this lesson does not apply directly.

Important Guidelines for Post-Hoc Testing

When NOT to Do Post-Hoc Tests

If ANOVA is not significant: If the F-test fails to reject H₀, STOP. Don't do post-hoc tests.
With only 2 groups: If k = 2, ANOVA is equivalent to a t-test. No post-hoc needed.
Before running ANOVA: Always do the overall F-test first!

The Proper Sequence

Run ANOVA to test if any differences exist
If ANOVA is significant: Proceed to post-hoc tests
Choose appropriate post-hoc test (usually Tukey's HSD)
Identify which specific pairs differ
Report findings with context and interpretation
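
The sequence above amounts to a simple gate: post-hoc comparisons run only if the ANOVA p-value is below α. The sketch below shows one way to express that. The function name `post_hoc_if_significant` is hypothetical, the means and HSD come from this lesson's worked example, and the p-value 0.0005 is only a placeholder standing in for the reported "p < 0.001".

```python
from itertools import combinations

def post_hoc_if_significant(anova_p, means, hsd, alpha=0.05):
    """Run Tukey-style pairwise checks only when the ANOVA is significant."""
    if anova_p >= alpha:
        return None  # F-test not significant: stop, no post-hoc tests
    # ANOVA significant: flag each pair whose mean difference exceeds HSD.
    return {
        (a, b): abs(means[a] - means[b]) > hsd
        for a, b in combinations(means, 2)
    }

means = {"Method 1": 80, "Method 2": 86, "Method 3": 92}

# 0.0005 is a placeholder for the reported "p < 0.001".
significant_case = post_hoc_if_significant(0.0005, means, hsd=4.01)
# Hypothetical non-significant ANOVA (p = 0.15): the function stops early.
stopped_case = post_hoc_if_significant(0.15, means, hsd=4.01)

print(significant_case)
print(stopped_case)
```

Returning `None` for the non-significant case keeps "no post-hoc tests were run" distinct from "tests were run and no pair differed", which would be a dictionary of `False` values.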

Check Your Understanding

Question 1

An ANOVA comparing 4 fertilizers gives F = 2.1, p = 0.15. Should you conduct post-hoc tests? Why or why not?

Answer: NO, do not conduct post-hoc tests.

Reason: The ANOVA is NOT significant (p = 0.15 > 0.05). We failed to reject H₀, meaning we don't have evidence that any of the fertilizers differ. Post-hoc tests are only conducted AFTER finding a significant F-statistic.

Question 2

Given: k = 4 groups, MSW = 12, n = 8 per group, q = 3.96. Calculate Tukey's HSD.

Step 1: Use the formula HSD = q × √(MSW / n)

HSD = 3.96 × √(12 / 8)

HSD = 3.96 × √1.5

HSD = 3.96 × 1.225 = 4.85

Answer: HSD = 4.85

Any two group means that differ by more than 4.85 are significantly different.

Question 3

Using the HSD from Question 2, determine if groups with x̄₁ = 50 and x̄₂ = 54 differ significantly.

Step 1: Calculate the difference

|x̄₁ - x̄₂| = |50 - 54| = 4

Step 2: Compare to HSD = 4.85

4 < 4.85

Answer: No significant difference.

The difference (4) is less than HSD (4.85), so we conclude that groups 1 and 2 do not differ significantly.

Question 4

If you're comparing 5 groups, how many pairwise comparisons are there? What would the Bonferroni-adjusted α be if the original α = 0.05?

Step 1: Calculate number of comparisons

c = k(k-1) / 2 = 5(4) / 2 = 10 comparisons

Step 2: Calculate adjusted α

α_adjusted = 0.05 / 10 = 0.005

Answers:

10 pairwise comparisons
Bonferroni α = 0.005 (or 0.5%) for each test

Lesson Summary

Post-hoc tests identify which specific groups differ after a significant ANOVA
Only conduct post-hoc tests if the ANOVA F-test is significant!
Tukey's HSD: Most common post-hoc test
- Formula: HSD = q × √(MSW / n)
- If |x̄ᵢ - x̄ⱼ| > HSD, groups differ significantly
Bonferroni correction: Adjusts α for multiple comparisons
- α_adjusted = α / c
- More conservative than Tukey
Other tests: Scheffé (most conservative), Dunnett (control comparison)
Post-hoc tests control family-wise error rate across multiple comparisons

Safaa Dabagh