Comprehensive Study Guide
Module 10: Analysis of Variance (ANOVA)
This study guide contains everything you need to master ANOVA:
- All key concepts and definitions
- Complete formulas with explanations
- Step-by-step procedures
- Assumptions and when to use ANOVA
- Common mistakes to avoid
Print this guide for exam preparation!
1. Key Concepts and Definitions
What is ANOVA?
ANOVA (Analysis of Variance) is a statistical method used to compare the means of three or more groups simultaneously.
Hypotheses:
• H₀: μ₁ = μ₂ = μ₃ = ... (all means are equal)
• Hₐ: At least one mean is different
Why ANOVA Instead of Multiple t-Tests?
Multiple t-tests inflate the Type I error rate. With k groups requiring k(k-1)/2 comparisons, the overall α increases dramatically.
Example: 4 groups = 6 t-tests → overall α ≈ 26.5% (not 5%)!
Solution: ANOVA performs ONE test, maintaining α = 0.05
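The inflation figure above can be reproduced with a short calculation. This is a sketch that treats the c pairwise tests as independent, which slightly overstates the true inflation for t-tests sharing data, but shows the trend:

```python
# Family-wise Type I error rate when running c pairwise tests at alpha each.
# Approximation: treats the tests as independent.
def familywise_alpha(k, alpha=0.05):
    c = k * (k - 1) // 2          # number of pairwise comparisons
    return 1 - (1 - alpha) ** c   # P(at least one false positive)

for k in (3, 4, 5):
    print(f"{k} groups: overall alpha ~ {familywise_alpha(k):.3f}")
```

For 4 groups this gives 1 − 0.95⁶ ≈ 0.265, the 26.5% quoted above.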
The Basic Idea
ANOVA compares two types of variation:
- Between-group variation: How much group means differ from each other
- Within-group variation: How much individuals vary within each group (error)
If between-group variation >> within-group variation → groups differ significantly
2. Complete Formula Reference
Sum of Squares (SS)
Total Sum of Squares (SST):
SST = Σ(xij - x̄)²
Total variation of all observations from grand mean
Between-Group Sum of Squares (SSB):
SSB = Σnj(x̄j - x̄)²
Variation due to group differences
Within-Group Sum of Squares (SSW):
SSW = Σ(xij - x̄j)²
Variation within groups (error)
Fundamental Relationship:
SST = SSB + SSW
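The identity SST = SSB + SSW holds for any data set and makes a good calculation check. A quick numeric verification using small made-up scores for three groups:

```python
import numpy as np

# Hypothetical data: three groups of five scores each (illustrative only).
groups = [np.array([80.0, 78, 83, 77, 82]),
          np.array([86.0, 84, 88, 85, 87]),
          np.array([92.0, 90, 95, 91, 94])]

grand_mean = np.mean(np.concatenate(groups))

sst = sum(((g - grand_mean) ** 2).sum() for g in groups)          # total
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # between
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)            # within

assert np.isclose(sst, ssb + ssw)  # fundamental identity
print(f"SST = {sst:.2f}, SSB = {ssb:.2f}, SSW = {ssw:.2f}")
```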
Degrees of Freedom (df)
df_between = k - 1 (k = number of groups)
df_within = N - k (N = total sample size)
df_total = N - 1
Verification: df_total = df_between + df_within
Mean Squares (MS)
MSB = SSB / df_between = SSB / (k-1)
Average between-group variance
MSW = SSW / df_within = SSW / (N-k)
Average within-group variance (pooled error)
F-Statistic
F = MSB / MSW
Interpretation:
• F ≈ 1: between-group and within-group variation are about equal; no evidence of group differences
• F >> 1: strong evidence that at least one group mean differs
• F < 1: within-group variation exceeds between-group variation (fail to reject H₀)
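The full F computation can be sketched in a few lines and cross-checked against SciPy's `f_oneway`. The data are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical data: three groups of five scores each.
groups = [np.array([80.0, 78, 83, 77, 82]),
          np.array([86.0, 84, 88, 85, 87]),
          np.array([92.0, 90, 95, 91, 94])]

k = len(groups)                     # number of groups
N = sum(len(g) for g in groups)     # total sample size
grand_mean = np.mean(np.concatenate(groups))

ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)

msb = ssb / (k - 1)   # mean square between
msw = ssw / (N - k)   # mean square within
F = msb / msw

# Cross-check against SciPy's built-in one-way ANOVA
F_scipy, p = stats.f_oneway(*groups)
assert np.isclose(F, F_scipy)
print(f"F({k - 1}, {N - k}) = {F:.2f}, p = {p:.5f}")
```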
Tukey's HSD (Post-Hoc Test)
HSD = q × √(MSW / n)
Where:
• q = Studentized Range critical value (from table)
• MSW = Mean Square Within
• n = sample size per group (assumes equal n)
Decision Rule:
If |x̄i - x̄j| > HSD → Groups i and j differ significantly
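A sketch of the HSD calculation using SciPy's `studentized_range` distribution (available in recent SciPy versions) in place of a printed q table. The data and α = 0.05 are assumptions for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical balanced design: k = 3 groups, n = 5 per group.
groups = [np.array([80.0, 78, 83, 77, 82]),
          np.array([86.0, 84, 88, 85, 87]),
          np.array([92.0, 90, 95, 91, 94])]
k, n = len(groups), len(groups[0])
N = k * n

ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)
msw = ssw / (N - k)   # mean square within

# q critical value from the Studentized Range distribution at alpha = 0.05
q = stats.studentized_range.ppf(0.95, k, N - k)
hsd = q * np.sqrt(msw / n)

means = [g.mean() for g in groups]
for i in range(k):
    for j in range(i + 1, k):
        diff = abs(means[i] - means[j])
        print(f"groups {i} vs {j}: |diff| = {diff:.2f}, "
              f"significant: {diff > hsd}")
```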
Bonferroni Correction
α_adjusted = α / c
Where:
• α = original significance level (e.g., 0.05)
• c = number of pairwise comparisons = k(k-1)/2
Use α_adjusted for each individual comparison
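The adjustment is a one-line calculation:

```python
# Bonferroni-adjusted per-comparison alpha for k groups, all pairwise tests.
def bonferroni_alpha(k, alpha=0.05):
    c = k * (k - 1) // 2   # number of pairwise comparisons
    return alpha / c

print(bonferroni_alpha(4))  # 4 groups -> 6 comparisons -> 0.05 / 6
```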
3. ANOVA Table Template
| Source of Variation | Sum of Squares (SS) | df | Mean Square (MS) | F |
|---|---|---|---|---|
| Between Groups | SSB | k - 1 | MSB = SSB/(k-1) | F = MSB/MSW |
| Within Groups | SSW | N - k | MSW = SSW/(N-k) | — |
| Total | SST | N - 1 | — | — |
4. Complete ANOVA Procedure
Step-by-Step Guide
- State hypotheses
- H₀: μ₁ = μ₂ = ... = μₖ
- Hₐ: At least one μ is different
- Choose significance level (usually α = 0.05)
- Check assumptions (independence, normality, equal variances)
- Calculate group means (x̄₁, x̄₂, ..., x̄ₖ) and grand mean (x̄)
- Calculate sum of squares (SSB, SSW, SST)
- Calculate degrees of freedom (df_between, df_within)
- Calculate mean squares (MSB, MSW)
- Calculate F-statistic (F = MSB/MSW)
- Find p-value or critical value using F-distribution with (df₁, df₂)
- Make decision
- If p ≤ α (or F ≥ Fcritical): Reject H₀
- If p > α (or F < Fcritical): Fail to reject H₀
- State conclusion in context
- If significant: Conduct post-hoc tests to identify which groups differ
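Step 10 can be done with SciPy's F distribution instead of a table. The F and df values below are the ones from the sample report in section 9:

```python
from scipy import stats

# Observed F-statistic with (df1, df2) degrees of freedom.
F_obs, df1, df2 = 31.75, 2, 12

p = stats.f.sf(F_obs, df1, df2)        # survival function = P(F >= F_obs)
F_crit = stats.f.ppf(0.95, df1, df2)   # critical value at alpha = 0.05

print(f"p = {p:.6f}, F_crit(0.05) = {F_crit:.2f}")
print("Reject H0" if p <= 0.05 else "Fail to reject H0")
```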
5. ANOVA Assumptions
Three Key Assumptions
| Assumption | How to Check | What If Violated? |
|---|---|---|
| 1. Independence: observations are independent | Study design; random assignment; no repeated measures | Use repeated-measures ANOVA or mixed models (CRITICAL - must address!) |
| 2. Normality: populations are normally distributed | Histograms; Q-Q plots; Shapiro-Wilk test | Kruskal-Wallis test; transform data; OK if n ≥ 30 (CLT) |
| 3. Equal variances: σ₁² = σ₂² = ... = σₖ² | Rule of thumb: max(s²)/min(s²) < 2; boxplots; Levene's test | Welch's ANOVA; Kruskal-Wallis test; OK if sample sizes are equal |
ANOVA is fairly robust to violations of normality and equal variances when:
- Sample sizes are large (n ≥ 30 per group)
- Sample sizes are equal across groups
- Violations are moderate (not severe)
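The normality and equal-variance checks can be sketched with SciPy. The data here are simulated, not from any real study:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated samples from three normal populations with equal variance.
groups = [rng.normal(loc=mu, scale=5.0, size=30) for mu in (50, 55, 60)]

# Normality: Shapiro-Wilk test per group (H0: data are normal)
for i, g in enumerate(groups):
    stat, p = stats.shapiro(g)
    print(f"group {i}: Shapiro-Wilk p = {p:.3f}")

# Equal variances: Levene's test (H0: variances are equal)
stat, p_levene = stats.levene(*groups)
print(f"Levene p = {p_levene:.3f}")

# Rule of thumb: max sample variance / min sample variance < 2
variances = [g.var(ddof=1) for g in groups]
print("variance ratio:", max(variances) / min(variances))
```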
6. Post-Hoc Tests Summary
| Test | When to Use | Characteristics |
|---|---|---|
| Tukey's HSD | After significant ANOVA; equal sample sizes; all pairwise comparisons | Most common; good power; controls family-wise error |
| Bonferroni | Unequal sample sizes; few specific comparisons; need to be extra conservative | Simple to use; very conservative; adjusts α for each test |
| Scheffé | Complex comparisons; maximum error protection | Most conservative; least powerful |
| Dunnett's | Comparing treatments to a control; not comparing treatments to each other | More powerful than Tukey for this purpose; common in medical research |
Rules for post-hoc testing:
- Only do post-hoc tests AFTER a significant ANOVA
- Don't do post-hoc tests with only 2 groups (the single comparison already answers the question)
- Choose ONE post-hoc method (don't mix)
7. Which Test Should I Use?
| Situation | Appropriate Test |
|---|---|
| Comparing 2 groups, means, independent | Two-sample t-test |
| Comparing 3+ groups, means, independent, assumptions met | One-way ANOVA |
| Comparing 3+ groups, severe non-normality, small n | Kruskal-Wallis test |
| Comparing 3+ groups, unequal variances | Welch's ANOVA |
| Same subjects measured multiple times | Repeated Measures ANOVA |
| Two or more factors/independent variables | Two-way ANOVA (Factorial ANOVA) |
| Testing relationship between categorical variables | Chi-square test |
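When the Kruskal-Wallis row in the table applies (severe non-normality, small n), the call mirrors `f_oneway`. The skewed data below are simulated for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated heavily skewed (exponential) data where normality is doubtful.
groups = [rng.exponential(scale=s, size=12) for s in (1.0, 1.5, 3.0)]

# Rank-based alternative to one-way ANOVA; compares distributions, not means.
H, p = stats.kruskal(*groups)
print(f"Kruskal-Wallis H = {H:.2f}, p = {p:.4f}")
```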
8. Common Mistakes to Avoid
DON'T:
- Use multiple t-tests when comparing 3+ groups (α inflation!)
- Do post-hoc tests when ANOVA is not significant
- Confuse SSB and SSW (between vs. within)
- Forget to divide by df when calculating mean squares
- Use regular ANOVA with repeated measures data
- Assume ANOVA tells you WHICH groups differ (need post-hoc!)
- Ignore assumptions - always check them!
- Report F-statistic without degrees of freedom
DO:
- Check all three assumptions before running ANOVA
- Report F-statistic with both degrees of freedom: F(df₁, df₂)
- Verify SST = SSB + SSW as a calculation check
- Use post-hoc tests only after significant ANOVA
- Choose post-hoc method before looking at data
- Report effect size (not just significance)
- Interpret results in context of the research question
- Consider practical significance, not just statistical significance
9. Sample ANOVA Report
Example of how to report ANOVA results professionally:
A one-way ANOVA was conducted to compare the effectiveness of three study methods (traditional, flashcards, practice tests) on exam performance. Participants (N = 15) were randomly assigned to one of three groups (n = 5 per group). The assumptions of independence, normality, and equal variances were met.
The ANOVA revealed a statistically significant difference in mean exam scores among the three study methods, F(2, 12) = 31.75, p < .001. Post-hoc comparisons using Tukey's HSD indicated that all three methods produced significantly different mean scores (HSD = 4.01). Practice tests (M = 92, SD = 2.24) yielded the highest scores, followed by flashcards (M = 86, SD = 1.58), and traditional study (M = 80, SD = 3.16).
These results suggest that active retrieval practice (practice tests and flashcards) is more effective than passive review (traditional study) for exam preparation, with practice testing being the most effective method.
10. Exam Preparation Tips
Formula Memorization Tips:
- Remember the ratio: F = MSB/MSW (between over within)
- SS relationship: Total = Between + Within (SST = SSB + SSW)
- df pattern: k-1 for between, N-k for within
- MS always: SS divided by its df
- HSD formula: q times square root of (MSW over n)
Calculation Checklist:
- Always calculate grand mean first
- Double-check: does SST = SSB + SSW?
- Verify df add up: df_total = df_between + df_within
- F-statistic should be positive (can't have negative variance)
- If F < 1, you probably made an error (or groups are very similar)
Conceptual Understanding:
- ANOVA = comparing group means for 3+ groups
- F-statistic = ratio of between to within variation
- Large F = groups differ significantly
- Significant ANOVA → do post-hoc to find which groups differ
- Three assumptions: independence (most critical!), normality, equal variances
Additional Resources
Related study materials:
- Quick Reference Card - One-page formula sheet
- Practice Problems - 20 comprehensive problems
- Module Quiz - Test your knowledge