Comprehensive Study Guide
Module 10: Analysis of Variance (ANOVA)
This study guide contains everything you need to master ANOVA:
- All key concepts and definitions
- Complete formulas with explanations
- Step-by-step procedures
- Assumptions and when to use ANOVA
- Common mistakes to avoid
Print this guide for exam preparation!
1. Key Concepts and Definitions
What is ANOVA?
ANOVA (Analysis of Variance) is a statistical method used to compare the means of three or more groups simultaneously.
Hypotheses:
• H₀: μ₁ = μ₂ = μ₃ = ... (all means are equal)
• Hₐ: At least one mean is different
Why ANOVA Instead of Multiple t-Tests?
Multiple t-tests inflate the Type I error rate. With k groups requiring k(k-1)/2 comparisons, the overall α increases dramatically.
Example: 4 groups = 6 t-tests → overall α ≈ 26.5% (not 5%)!
Solution: ANOVA performs ONE test, maintaining α = 0.05
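The inflation figure above can be reproduced with a short calculation. This is a sketch that treats the c pairwise tests as independent, which slightly overstates the true inflation for t-tests sharing data, but shows the trend:

```python
# Family-wise Type I error rate when running c pairwise tests at alpha each.
# Approximation: treats the tests as independent.
def familywise_alpha(k, alpha=0.05):
    c = k * (k - 1) // 2          # number of pairwise comparisons
    return 1 - (1 - alpha) ** c   # P(at least one false positive)

for k in (3, 4, 5):
    print(f"{k} groups: overall alpha ~ {familywise_alpha(k):.3f}")
```

For 4 groups this gives 1 − 0.95⁶ ≈ 0.265, the 26.5% quoted above.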
The Basic Idea
ANOVA compares two types of variation:
- Between-group variation: How much group means differ from each other
- Within-group variation: How much individuals vary within each group (error)
If between-group variation >> within-group variation → groups differ significantly
2. Complete Formula Reference
Sum of Squares (SS)
Total Sum of Squares (SST):
SST = Σ(xij - x̄)²
Total variation of all observations from grand mean
Between-Group Sum of Squares (SSB):
SSB = Σnj(x̄j - x̄)²
Variation due to group differences
Within-Group Sum of Squares (SSW):
SSW = Σ(xij - x̄j)²
Variation within groups (error)
Fundamental Relationship:
SST = SSB + SSW
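The identity SST = SSB + SSW holds for any data set and makes a good calculation check. A quick numeric verification using small made-up scores for three groups:

```python
import numpy as np

# Hypothetical data: three groups of five scores each (illustrative only).
groups = [np.array([80.0, 78, 83, 77, 82]),
          np.array([86.0, 84, 88, 85, 87]),
          np.array([92.0, 90, 95, 91, 94])]

grand_mean = np.mean(np.concatenate(groups))

sst = sum(((g - grand_mean) ** 2).sum() for g in groups)          # total
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # between
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)            # within

assert np.isclose(sst, ssb + ssw)  # fundamental identity
print(f"SST = {sst:.2f}, SSB = {ssb:.2f}, SSW = {ssw:.2f}")
```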
Degrees of Freedom (df)
df_between = k - 1 (k = number of groups)
df_within = N - k (N = total sample size)
df_total = N - 1
Verification: df_total = df_between + df_within
Mean Squares (MS)
MSB = SSB / df_between = SSB / (k-1)
Average between-group variance
MSW = SSW / df_within = SSW / (N-k)
Average within-group variance (pooled error)
F-Statistic
F = MSB / MSW
Interpretation:
• F ≈ 1: between-group and within-group variation are about equal; no evidence of group differences
• F >> 1: strong evidence that at least one group mean differs
• F < 1: within-group variation exceeds between-group variation (fail to reject H₀)
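The full F computation can be sketched in a few lines and cross-checked against SciPy's `f_oneway`. The data are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical data: three groups of five scores each.
groups = [np.array([80.0, 78, 83, 77, 82]),
          np.array([86.0, 84, 88, 85, 87]),
          np.array([92.0, 90, 95, 91, 94])]

k = len(groups)                     # number of groups
N = sum(len(g) for g in groups)     # total sample size
grand_mean = np.mean(np.concatenate(groups))

ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)

msb = ssb / (k - 1)   # mean square between
msw = ssw / (N - k)   # mean square within
F = msb / msw

# Cross-check against SciPy's built-in one-way ANOVA
F_scipy, p = stats.f_oneway(*groups)
assert np.isclose(F, F_scipy)
print(f"F({k - 1}, {N - k}) = {F:.2f}, p = {p:.5f}")
```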
Tukey's HSD (Post-Hoc Test)
HSD = q × √(MSW / n)
Where:
• q = Studentized Range critical value (from table)
• MSW = Mean Square Within
• n = sample size per group (assumes equal n)
Decision Rule:
If |x̄i - x̄j| > HSD → Groups i and j differ significantly
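A sketch of the HSD calculation using SciPy's `studentized_range` distribution (available in recent SciPy versions) in place of a printed q table. The data and α = 0.05 are assumptions for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical balanced design: k = 3 groups, n = 5 per group.
groups = [np.array([80.0, 78, 83, 77, 82]),
          np.array([86.0, 84, 88, 85, 87]),
          np.array([92.0, 90, 95, 91, 94])]
k, n = len(groups), len(groups[0])
N = k * n

ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)
msw = ssw / (N - k)   # mean square within

# q critical value from the Studentized Range distribution at alpha = 0.05
q = stats.studentized_range.ppf(0.95, k, N - k)
hsd = q * np.sqrt(msw / n)

means = [g.mean() for g in groups]
for i in range(k):
    for j in range(i + 1, k):
        diff = abs(means[i] - means[j])
        print(f"groups {i} vs {j}: |diff| = {diff:.2f}, "
              f"significant: {diff > hsd}")
```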
Bonferroni Correction
α_adjusted = α / c
Where:
• α = original significance level (e.g., 0.05)
• c = number of pairwise comparisons = k(k-1)/2
Use α_adjusted for each individual comparison
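The adjustment is a one-line calculation:

```python
# Bonferroni-adjusted per-comparison alpha for k groups, all pairwise tests.
def bonferroni_alpha(k, alpha=0.05):
    c = k * (k - 1) // 2   # number of pairwise comparisons
    return alpha / c

print(bonferroni_alpha(4))  # 4 groups -> 6 comparisons -> 0.05 / 6
```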
3. ANOVA Table Template
| Source of Variation | Sum of Squares (SS) | df | Mean Square (MS) | F |
|---|---|---|---|---|
| Between Groups | SSB | k - 1 | MSB = SSB/(k-1) | F = MSB/MSW |
| Within Groups | SSW | N - k | MSW = SSW/(N-k) | — |
| Total | SST | N - 1 | — | — |
4. Complete ANOVA Procedure
Step-by-Step Guide
- State hypotheses
- H₀: μ₁ = μ₂ = ... = μₖ
- Hₐ: At least one μ is different
- Choose significance level (usually α = 0.05)
- Check assumptions (independence, normality, equal variances)
- Calculate group means (x̄₁, x̄₂, ..., x̄ₖ) and grand mean (x̄)
- Calculate sum of squares (SSB, SSW, SST)
- Calculate degrees of freedom (df_between, df_within)
- Calculate mean squares (MSB, MSW)
- Calculate F-statistic (F = MSB/MSW)
- Find p-value or critical value using F-distribution with (df₁, df₂)
- Make decision
- If p ≤ α (or F ≥ Fcritical): Reject H₀
- If p > α (or F < Fcritical): Fail to reject H₀
- State conclusion in context
- If significant: Conduct post-hoc tests to identify which groups differ
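Step 10 can be done with SciPy's F distribution instead of a table. The F and df values below are the ones from the sample report in section 9:

```python
from scipy import stats

# Observed F-statistic with (df1, df2) degrees of freedom.
F_obs, df1, df2 = 31.75, 2, 12

p = stats.f.sf(F_obs, df1, df2)        # survival function = P(F >= F_obs)
F_crit = stats.f.ppf(0.95, df1, df2)   # critical value at alpha = 0.05

print(f"p = {p:.6f}, F_crit(0.05) = {F_crit:.2f}")
print("Reject H0" if p <= 0.05 else "Fail to reject H0")
```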
5. ANOVA Assumptions
Three Key Assumptions
| Assumption | How to Check | What If Violated? |
|---|---|---|
| 1. Independence: observations are independent | Study design; random assignment; no repeated measures | Use repeated-measures ANOVA or mixed models (CRITICAL - must address!) |
| 2. Normality: populations are normally distributed | Histograms; Q-Q plots; Shapiro-Wilk test | Kruskal-Wallis test; transform data; OK if n ≥ 30 (CLT) |
| 3. Equal variances: σ₁² = σ₂² = ... = σₖ² | Rule of thumb: max(s²)/min(s²) < 2; boxplots; Levene's test | Welch's ANOVA; Kruskal-Wallis test; OK if sample sizes are equal |
ANOVA is fairly robust to violations of normality and equal variances when:
- Sample sizes are large (n ≥ 30 per group)
- Sample sizes are equal across groups
- Violations are moderate (not severe)
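The normality and equal-variance checks can be sketched with SciPy. The data here are simulated, not from any real study:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated samples from three normal populations with equal variance.
groups = [rng.normal(loc=mu, scale=5.0, size=30) for mu in (50, 55, 60)]

# Normality: Shapiro-Wilk test per group (H0: data are normal)
for i, g in enumerate(groups):
    stat, p = stats.shapiro(g)
    print(f"group {i}: Shapiro-Wilk p = {p:.3f}")

# Equal variances: Levene's test (H0: variances are equal)
stat, p_levene = stats.levene(*groups)
print(f"Levene p = {p_levene:.3f}")

# Rule of thumb: max sample variance / min sample variance < 2
variances = [g.var(ddof=1) for g in groups]
print("variance ratio:", max(variances) / min(variances))
```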
6. Post-Hoc Tests Summary
| Test | When to Use | Characteristics |
|---|---|---|
| Tukey's HSD | After significant ANOVA; equal sample sizes; all pairwise comparisons | Most common; good power; controls family-wise error |
| Bonferroni | Unequal sample sizes; few specific comparisons; need to be extra conservative | Simple to use; very conservative; adjusts α for each test |
| Scheffé | Complex comparisons; maximum error protection | Most conservative; least powerful |
| Dunnett's | Comparing treatments to a control; not comparing treatments to each other | More powerful than Tukey for this purpose; common in medical research |
Rules for post-hoc testing:
- Only do post-hoc tests AFTER a significant ANOVA
- Don't do post-hoc tests with only 2 groups (the single comparison already answers the question)
- Choose ONE post-hoc method (don't mix)
7. Which Test Should I Use?
| Situation | Appropriate Test |
|---|---|
| Comparing 2 groups, means, independent | Two-sample t-test |
| Comparing 3+ groups, means, independent, assumptions met | One-way ANOVA |
| Comparing 3+ groups, severe non-normality, small n | Kruskal-Wallis test |
| Comparing 3+ groups, unequal variances | Welch's ANOVA |
| Same subjects measured multiple times | Repeated Measures ANOVA |
| Two or more factors/independent variables | Two-way ANOVA (Factorial ANOVA) |
| Testing relationship between categorical variables | Chi-square test |
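When the Kruskal-Wallis row in the table applies (severe non-normality, small n), the call mirrors `f_oneway`. The skewed data below are simulated for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated heavily skewed (exponential) data where normality is doubtful.
groups = [rng.exponential(scale=s, size=12) for s in (1.0, 1.5, 3.0)]

# Rank-based alternative to one-way ANOVA; compares distributions, not means.
H, p = stats.kruskal(*groups)
print(f"Kruskal-Wallis H = {H:.2f}, p = {p:.4f}")
```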
8. Common Mistakes to Avoid
DON'T:
- Use multiple t-tests when comparing 3+ groups (α inflation!)
- Do post-hoc tests when ANOVA is not significant
- Confuse SSB and SSW (between vs. within)
- Forget to divide by df when calculating mean squares
- Use regular ANOVA with repeated measures data
- Assume ANOVA tells you WHICH groups differ (need post-hoc!)
- Ignore assumptions - always check them!
- Report F-statistic without degrees of freedom
DO:
- Check all three assumptions before running ANOVA
- Report F-statistic with both degrees of freedom: F(df₁, df₂)
- Verify SST = SSB + SSW as a calculation check
- Use post-hoc tests only after significant ANOVA
- Choose post-hoc method before looking at data
- Report effect size (not just significance)
- Interpret results in context of the research question
- Consider practical significance, not just statistical significance
9. Sample ANOVA Report
Example of how to report ANOVA results professionally:
A one-way ANOVA was conducted to compare the effectiveness of three study methods (traditional, flashcards, practice tests) on exam performance. Participants (N = 15) were randomly assigned to one of three groups (n = 5 per group). The assumptions of independence, normality, and equal variances were met.
The ANOVA revealed a statistically significant difference in mean exam scores among the three study methods, F(2, 12) = 31.75, p < .001. Post-hoc comparisons using Tukey's HSD indicated that all three methods produced significantly different mean scores (HSD = 4.01). Practice tests (M = 92, SD = 2.24) yielded the highest scores, followed by flashcards (M = 86, SD = 1.58), and traditional study (M = 80, SD = 3.16).
These results suggest that active retrieval practice (practice tests and flashcards) is more effective than passive review (traditional study) for exam preparation, with practice testing being the most effective method.
10. Exam Preparation Tips
Formula Memorization Tips:
- Remember the ratio: F = MSB/MSW (between over within)
- SS relationship: Total = Between + Within (SST = SSB + SSW)
- df pattern: k-1 for between, N-k for within
- MS always: SS divided by its df
- HSD formula: q times square root of (MSW over n)
Calculation Checklist:
- Always calculate grand mean first
- Double-check: does SST = SSB + SSW?
- Verify df add up: df_total = df_between + df_within
- F-statistic should be positive (can't have negative variance)
- If F < 1, you probably made an error (or groups are very similar)
Conceptual Understanding:
- ANOVA = comparing group means for 3+ groups
- F-statistic = ratio of between to within variation
- Large F = groups differ significantly
- Significant ANOVA → do post-hoc to find which groups differ
- Three assumptions: independence (most critical!), normality, equal variances
Additional Resources
Related study materials:
- Quick Reference Card - One-page formula sheet
- Practice Problems - 20 comprehensive problems
- Module Quiz - Test your knowledge