Safaa Dabagh

Comprehensive Study Guide

Module 10: Analysis of Variance (ANOVA)

This study guide contains everything you need to master ANOVA.

1. Key Concepts and Definitions

What is ANOVA?

ANOVA (Analysis of Variance) is a statistical method used to compare the means of three or more groups simultaneously.

Purpose: Test if at least one population mean differs from the others
Hypotheses:
• H₀: μ₁ = μ₂ = μ₃ = ... (all means are equal)
• Hₐ: At least one mean is different

Why ANOVA Instead of Multiple t-Tests?

α Inflation Problem:
Multiple t-tests inflate the Type I error rate. With k groups requiring k(k-1)/2 comparisons, the overall α increases dramatically.
Example: 4 groups = 6 t-tests → overall α ≈ 1 - (0.95)⁶ ≈ 26.5% (not 5%), treating the tests as independent!
Solution: ANOVA performs ONE test, maintaining α = 0.05
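
The inflation figure above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the pairwise tests are independent (a simplification, since pairwise t-tests share data); the function name familywise_error_rate is chosen here for illustration only.

```python
from math import comb

def familywise_error_rate(k, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one Type I error).

    Assumes the tests are independent, which is only an approximation.
    """
    c = comb(k, 2)                    # k(k-1)/2 pairwise comparisons
    return c, 1 - (1 - alpha) ** c    # 1 - P(no false positive in any test)

c, rate = familywise_error_rate(4)
print(c, round(rate, 3))              # 6 0.265
```

Trying k = 5 or k = 6 shows how quickly the overall error rate grows as groups are added.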

The Basic Idea

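ANOVA compares two types of variation:
• Between-group variation: how far the group means are from the grand mean
• Within-group variation: how far individual observations are from their own group mean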

If between-group variation >> within-group variation → groups differ significantly

2. Complete Formula Reference

Sum of Squares (SS)

Total Sum of Squares (SST):
SST = Σ(xij - x̄)²
Total variation of all observations from grand mean

Between-Group Sum of Squares (SSB):
SSB = Σnj(x̄j - x̄)²
Variation due to group differences

Within-Group Sum of Squares (SSW):
SSW = Σ(xij - x̄j)²
Variation within groups (error)

Fundamental Relationship:
SST = SSB + SSW
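
The three sums of squares can be computed directly from their definitions. The sketch below uses a small hypothetical data set (three groups of three observations, invented for illustration) and confirms that SST = SSB + SSW.

```python
def sums_of_squares(groups):
    """Return (SST, SSB, SSW) for a list of groups, each a list of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # SST: every observation against the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # SSB: each group mean against the grand mean, weighted by group size
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # SSW: every observation against its own group mean
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return sst, ssb, ssw

groups = [[2, 3, 4], [5, 6, 7], [8, 9, 10]]   # hypothetical data
sst, ssb, ssw = sums_of_squares(groups)
print(sst, ssb, ssw)                           # 60.0 54.0 6.0
```

With real data the equality may hold only up to floating-point rounding, so compare with a small tolerance rather than ==.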

Degrees of Freedom (df)

df_between = k - 1 (k = number of groups)
df_within = N - k (N = total sample size)
df_total = N - 1

Verification: df_total = df_between + df_within

Mean Squares (MS)

MSB = SSB / df_between = SSB / (k-1)
Average between-group variance

MSW = SSW / df_within = SSW / (N-k)
Average within-group variance (pooled error)

F-Statistic

F = MSB / MSW

Interpretation:
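• F ≈ 1: Between-group variation is about the same as within-group variation (consistent with H₀)
• F >> 1: Strong evidence of group differences
• F < 1: Between-group variation is smaller than within-group variation (fail to reject H₀)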
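
Going from sums of squares to F is just two divisions and a ratio. The sketch below assumes hypothetical values SSB = 54 and SSW = 6 from k = 3 groups and N = 9 observations; the function name f_from_ss is illustrative.

```python
def f_from_ss(ssb, ssw, k, n_total):
    """Return (df_between, df_within, MSB, MSW, F) from the sums of squares."""
    df_between = k - 1
    df_within = n_total - k
    msb = ssb / df_between    # mean square between
    msw = ssw / df_within     # mean square within (pooled error)
    return df_between, df_within, msb, msw, msb / msw

print(f_from_ss(ssb=54, ssw=6, k=3, n_total=9))   # (2, 6, 27.0, 1.0, 27.0)
```

Here F = 27 is far above 1, so the between-group variation is much larger than the within-group noise.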

Tukey's HSD (Post-Hoc Test)

HSD = q × √(MSW / n)

Where:
• q = Studentized Range critical value (from table)
• MSW = Mean Square Within
• n = sample size per group (assumes equal n)

Decision Rule:
If |x̄i - x̄j| > HSD → Groups i and j differ significantly
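
The decision rule above can be applied to every pair of groups in a loop. This is a sketch with hypothetical numbers: three group means, MSW = 10, and n = 5 per group. The value q = 3.77 is the table value assumed here for 3 groups and 12 within-group degrees of freedom at α = 0.05; always look up q for your own k and df_within.

```python
from itertools import combinations
from math import sqrt

def tukey_hsd(q, msw, n):
    """Honestly Significant Difference for equal group sizes n."""
    return q * sqrt(msw / n)

means = {"A": 80.0, "B": 83.0, "C": 90.0}    # hypothetical group means
hsd = tukey_hsd(q=3.77, msw=10.0, n=5)       # q taken from a table (assumed)

# A pair differs significantly when its mean difference exceeds HSD
differs = {(a, b): abs(means[a] - means[b]) > hsd
           for a, b in combinations(means, 2)}

print(round(hsd, 2))   # 5.33
print(differs)         # {('A', 'B'): False, ('A', 'C'): True, ('B', 'C'): True}
```

In this example A and B differ by only 3 points, which is below HSD, so only the comparisons involving C are significant.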

Bonferroni Correction

α_adjusted = α / c

Where:
• α = original significance level (e.g., 0.05)
• c = number of pairwise comparisons = k(k-1)/2

Use α_adjusted for each individual comparison
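
As a quick illustration of the correction, the sketch below adjusts α for all pairwise comparisons among k = 3 groups and applies it to three p-values. The p-values are hypothetical, invented only to show the comparison step.

```python
from math import comb

def bonferroni_alpha(k, alpha=0.05):
    """Per-comparison significance level for all pairwise comparisons."""
    c = comb(k, 2)        # number of pairwise comparisons, k(k-1)/2
    return alpha / c

alpha_adjusted = bonferroni_alpha(3)          # 0.05 / 3

# Hypothetical p-values from three pairwise t-tests
p_values = {("A", "B"): 0.030, ("A", "C"): 0.004, ("B", "C"): 0.012}
significant = {pair: p <= alpha_adjusted for pair, p in p_values.items()}

print(round(alpha_adjusted, 4))   # 0.0167
print(significant)                # {('A', 'B'): False, ('A', 'C'): True, ('B', 'C'): True}
```

Note that p = 0.030 would count as significant at the unadjusted α = 0.05 but not after the correction, which is what makes Bonferroni conservative.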

3. ANOVA Table Template

Source of Variation | Sum of Squares (SS) | df | Mean Square (MS) | F
Between Groups | SSB | k - 1 | MSB = SSB/(k-1) | F = MSB/MSW
Within Groups | SSW | N - k | MSW = SSW/(N-k) |
Total | SST | N - 1 | |

4. Complete ANOVA Procedure

Step-by-Step Guide

  1. State hypotheses
    • H₀: μ₁ = μ₂ = ... = μₖ
    • Hₐ: At least one μ is different
  2. Choose significance level (usually α = 0.05)
  3. Check assumptions (independence, normality, equal variances)
  4. Calculate group means (x̄₁, x̄₂, ..., x̄ₖ) and grand mean (x̄)
  5. Calculate sum of squares (SSB, SSW, SST)
  6. Calculate degrees of freedom (df_between, df_within)
  7. Calculate mean squares (MSB, MSW)
  8. Calculate F-statistic (F = MSB/MSW)
  9. Find p-value or critical value using F-distribution with (df₁, df₂)
  10. Make decision
    • If p ≤ α (or F ≥ F_critical): Reject H₀
    • If p > α (or F < F_critical): Fail to reject H₀
  11. State conclusion in context
  12. If significant: Conduct post-hoc tests to identify which groups differ
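
Steps 4 through 10 can be strung together in one short function. The following is a sketch, not a replacement for statistical software: the data are hypothetical, the assumption checks (step 3) are left out, and the critical value 5.14 is the table value assumed here for F(2, 6) at α = 0.05.

```python
def one_way_anova(groups, f_critical):
    """Run steps 4-10 of the procedure on a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Step 4: group means and grand mean
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Step 5: sums of squares
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Steps 6-8: degrees of freedom, mean squares, F
    df_between, df_within = k - 1, n_total - k
    f_stat = (ssb / df_between) / (ssw / df_within)
    # Steps 9-10: compare with the critical value from an F table
    return {"F": f_stat, "df": (df_between, df_within),
            "reject_H0": f_stat >= f_critical}

data = [[2, 3, 4], [5, 6, 7], [8, 9, 10]]      # hypothetical data
result = one_way_anova(data, f_critical=5.14)   # critical value from a table (assumed)
print(result)   # {'F': 27.0, 'df': (2, 6), 'reject_H0': True}
```

Because H₀ is rejected here, the next step would be a post-hoc test (step 12) to find which groups differ.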

5. ANOVA Assumptions

Three Key Assumptions

1. Independence
Observations are independent
How to check:
• Study design
• Random assignment
• No repeated measures
What if violated?
• Use Repeated Measures ANOVA
• Use mixed models
• CRITICAL - must address!

2. Normality
Populations are normally distributed
How to check:
• Histograms
• Q-Q plots
• Shapiro-Wilk test
What if violated?
• Kruskal-Wallis test
• Transform data
• OK if n ≥ 30 (CLT)

3. Equal Variances
σ₁² = σ₂² = ... = σₖ²
How to check:
• Rule: max(s²)/min(s²) < 2
• Boxplots
• Levene's test
What if violated?
• Welch's ANOVA
• Kruskal-Wallis test
• OK if equal sample sizes

Robustness: ANOVA is fairly robust to violations of normality and equal variance when:
  • Sample sizes are large (n ≥ 30 per group)
  • Sample sizes are equal across groups
  • Violations are moderate (not severe)
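
The variance-ratio rule of thumb listed under Equal Variances is easy to check by hand or in code. This sketch uses the standard library's statistics.variance (sample variance, n - 1 denominator) on hypothetical data; the threshold of 2 is the one given in this guide, and the helper name variance_ratio is illustrative.

```python
from statistics import variance

def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

# Hypothetical data: the third group is much more spread out
groups = [
    [4.0, 5.0, 6.0, 7.0, 8.0],
    [5.0, 6.0, 7.0, 8.0, 9.0],
    [2.0, 5.0, 7.0, 9.0, 12.0],
]
ratio = variance_ratio(groups)
print(round(ratio, 1))     # 5.8
print(ratio < 2)           # False: the rule of thumb flags unequal variances
```

A ratio this large suggests following up with boxplots or Levene's test and considering Welch's ANOVA.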

6. Post-Hoc Tests Summary

Tukey's HSD
When to use:
• After significant ANOVA
• Equal sample sizes
• All pairwise comparisons
Characteristics:
• Most common
• Good power
• Controls family-wise error

Bonferroni
When to use:
• Unequal sample sizes
• Few specific comparisons
• Need to be extra conservative
Characteristics:
• Simple to use
• Very conservative
• Adjusts α for each test

Scheffé
When to use:
• Complex comparisons
• Maximum error protection
Characteristics:
• Most conservative
• Least powerful

Dunnett's
When to use:
• Comparing treatments to control
• Not comparing treatments to each other
Characteristics:
• More powerful than Tukey for this purpose
• Common in medical research

Important Rules:
  • Only do post-hoc tests AFTER significant ANOVA
  • Don't do post-hoc with only 2 groups
  • Choose ONE post-hoc method (don't mix)

7. Which Test Should I Use?

Situation | Appropriate Test
Comparing 2 groups, means, independent | Two-sample t-test
Comparing 3+ groups, means, independent, assumptions met | One-way ANOVA
Comparing 3+ groups, severe non-normality, small n | Kruskal-Wallis test
Comparing 3+ groups, unequal variances | Welch's ANOVA
Same subjects measured multiple times | Repeated Measures ANOVA
Two or more factors/independent variables | Two-way ANOVA (Factorial ANOVA)
Testing relationship between categorical variables | Chi-square test

8. Common Mistakes to Avoid

DON'T:

  • Use multiple t-tests when comparing 3+ groups (α inflation!)
  • Do post-hoc tests when ANOVA is not significant
  • Confuse SSB and SSW (between vs. within)
  • Forget to divide by df when calculating mean squares
  • Use regular ANOVA with repeated measures data
  • Assume ANOVA tells you WHICH groups differ (need post-hoc!)
  • Ignore assumptions - always check them!
  • Report F-statistic without degrees of freedom

DO:

  • Check all three assumptions before running ANOVA
  • Report F-statistic with both degrees of freedom: F(df₁, df₂)
  • Verify SST = SSB + SSW as a calculation check
  • Use post-hoc tests only after significant ANOVA
  • Choose post-hoc method before looking at data
  • Report effect size (not just significance)
  • Interpret results in context of the research question
  • Consider practical significance, not just statistical significance

9. Sample ANOVA Report

Example of how to report ANOVA results professionally:

A one-way ANOVA was conducted to compare the effectiveness of three study methods (traditional, flashcards, practice tests) on exam performance. Participants (N = 15) were randomly assigned to one of three groups (n = 5 per group). The assumptions of independence, normality, and equal variances were met.

The ANOVA revealed a statistically significant difference in mean exam scores among the three study methods, F(2, 12) = 31.75, p < .001. Post-hoc comparisons using Tukey's HSD indicated that all three methods produced significantly different mean scores (HSD = 4.01). Practice tests (M = 92, SD = 2.24) yielded the highest scores, followed by flashcards (M = 86, SD = 1.58), and traditional study (M = 80, SD = 3.16).

These results suggest that active retrieval practice (practice tests and flashcards) is more effective than passive review (traditional study) for exam preparation, with practice testing being the most effective method.

10. Exam Preparation Tips

Formula Memorization Tips:

  • Remember the ratio: F = MSB/MSW (between over within)
  • SS relationship: Total = Between + Within (SST = SSB + SSW)
  • df pattern: k-1 for between, N-k for within
  • MS always: SS divided by its df
  • HSD formula: q times square root of (MSW over n)

Calculation Checklist:

  • Always calculate grand mean first
  • Double-check: does SST = SSB + SSW?
  • Verify df add up: df_total = df_between + df_within
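  • F-statistic can never be negative (it is a ratio of two variances); a negative F means a calculation error
  • If F < 1, between-group variation is smaller than within-group variation; recheck your arithmetic, but this is a valid result when the group means are very similar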

Conceptual Understanding:

  • ANOVA = comparing group means for 3+ groups
  • F-statistic = ratio of between to within variation
  • Large F = groups differ significantly
  • Significant ANOVA → do post-hoc to find which groups differ
  • Three assumptions: independence (most critical!), normality, equal variances
