Lesson 2: One-Way ANOVA Procedure
Step-by-Step Calculations and the ANOVA Table
What is One-Way ANOVA?
One-way ANOVA is used when we have:
- One categorical independent variable (called a factor) with 3+ levels (groups)
- One quantitative dependent variable (the measurement we're comparing)
Example
Factor: Teaching method (Traditional, Flipped, Project-based)
Dependent variable: Final exam score
This is "one-way" because we're looking at the effect of ONE factor (teaching method).
Note: There's also two-way ANOVA (two factors) and other variations, but we'll focus on one-way ANOVA in this course.
Understanding Sum of Squares
ANOVA breaks down the total variation in the data into two components:
1. Total Sum of Squares (SST)
Measures the total variation of all observations from the grand mean:
SST = ΣΣ(xij - x̄)²
Where:
- xij = individual observation (i-th observation in j-th group)
- x̄ = grand mean (mean of ALL observations)
2. Between-Group Sum of Squares (SSB)
Measures the variation between group means:
SSB = Σnj(x̄j - x̄)²
Where:
- nj = sample size of group j
- x̄j = mean of group j
- x̄ = grand mean
3. Within-Group Sum of Squares (SSW)
Measures the variation within each group (error/residual):
SSW = ΣΣ(xij - x̄j)²
Where:
- xij = individual observation in group j
- x̄j = mean of group j
Fundamental Relationship
SST = SSB + SSW
Total variation = Variation between groups + Variation within groups
This is the key partitioning that makes ANOVA work!
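The partition can be checked numerically. A minimal Python sketch with illustrative data (three small groups; the numbers themselves are arbitrary):

```python
# Sketch: numerically verify the partition SST = SSB + SSW on illustrative data.
groups = [[78, 82, 76], [85, 88, 84], [92, 95, 90]]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)

# SST: every observation's squared deviation from the grand mean
sst = sum((x - grand_mean) ** 2 for x in all_obs)
# SSB: group size times each group mean's squared deviation from the grand mean
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# SSW: each observation's squared deviation from its own group mean
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

print(abs(sst - (ssb + ssw)) < 1e-9)  # True: the partition holds
```

Because the partition is an algebraic identity, this check succeeds for any data you substitute in.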
Degrees of Freedom
Each sum of squares has associated degrees of freedom (df):
- dfbetween = k - 1, where k = number of groups
- dfwithin = N - k, where N = total sample size (all observations)
- dftotal = N - 1
Relationship
(N - 1) = (k - 1) + (N - k)
Example
If we have 4 groups with 10 observations each:
- k = 4 groups
- N = 40 total observations
- dfbetween = 4 - 1 = 3
- dfwithin = 40 - 4 = 36
- dftotal = 40 - 1 = 39
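The bookkeeping above is easy to wrap in a small helper. A sketch (the function name and group sizes are hypothetical):

```python
# Sketch: degrees of freedom for one-way ANOVA (group sizes here are hypothetical).
def anova_df(group_sizes):
    k = len(group_sizes)           # number of groups
    n = sum(group_sizes)           # total sample size
    return k - 1, n - k, n - 1     # (between, within, total)

df_between, df_within, df_total = anova_df([10, 10, 10, 10])
print(df_between, df_within, df_total)  # 3 36 39
assert df_between + df_within == df_total  # the df partition always holds
```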
Mean Squares
Mean squares are the average squared deviations. We calculate them by dividing each sum of squares by its degrees of freedom.
Mean Square Between (MSB)
MSB = SSB / dfbetween = SSB / (k - 1)
This estimates the variance between groups.
Mean Square Within (MSW)
MSW = SSW / dfwithin = SSW / (N - k)
This estimates the variance within groups (pooled error variance).
Why Mean Squares?
We divide by degrees of freedom to get an average measure of variation that's comparable between different sample sizes. Mean squares are estimates of variance.
The F-Statistic
Finally, we calculate the F-statistic by comparing the two mean squares:
F = MSB / MSW
Interpretation
- If MSB >> MSW (F is large): Between-group differences are large compared to within-group variation → groups likely differ
- If MSB ≈ MSW (F ≈ 1): Between-group differences are similar to within-group variation → groups likely don't differ
The F-statistic follows an F-distribution with:
- df₁ = dfbetween (numerator degrees of freedom)
- df₂ = dfwithin (denominator degrees of freedom)
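Tail probabilities of the F-distribution normally come from a table or a stats library (e.g. `scipy.stats.f.sf`). For the special case df₁ = 2, though, the upper-tail probability has the closed form (1 + 2x/df₂)^(−df₂/2), which a short pure-Python sketch can exploit:

```python
# Sketch: upper-tail F probability for the special case df1 = 2, where the
# survival function has the closed form P(F > x) = (1 + 2x/df2) ** (-df2/2).
# For general df1, use a stats library routine (e.g. scipy.stats.f.sf) instead.
def f_upper_tail_df1_2(x, df2):
    return (1.0 + 2.0 * x / df2) ** (-df2 / 2.0)

# The tabled critical value F(2, 12) ≈ 3.89 should leave about 5% in the tail:
print(round(f_upper_tail_df1_2(3.89, 12), 3))  # 0.05
```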
The ANOVA Table
We organize all ANOVA calculations in a standard table format:
| Source of Variation | Sum of Squares (SS) | Degrees of Freedom (df) | Mean Square (MS) | F-statistic |
|---|---|---|---|---|
| Between Groups | SSB | k - 1 | MSB = SSB/(k-1) | F = MSB/MSW |
| Within Groups | SSW | N - k | MSW = SSW/(N-k) | — |
| Total | SST | N - 1 | — | — |
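The whole table can be generated from raw data. A sketch in pure Python (the function name and dictionary layout are illustrative, with keys mirroring the table columns above):

```python
# Sketch: build a one-way ANOVA table from raw group data (pure Python,
# no external libraries; keys mirror the standard table columns).
def anova_table(groups):
    k = len(groups)                                # number of groups
    n = sum(len(g) for g in groups)                # total sample size
    grand = sum(x for g in groups for x in g) / n  # grand mean
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    msb, msw = ssb / (k - 1), ssw / (n - k)
    return {
        "Between": {"SS": ssb, "df": k - 1, "MS": msb, "F": msb / msw},
        "Within":  {"SS": ssw, "df": n - k, "MS": msw},
        "Total":   {"SS": ssb + ssw, "df": n - 1},
    }
```

Passing a list of lists (one inner list per group) returns the three table rows as dictionaries.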
Complete Step-by-Step Example
Scenario
A professor wants to compare the effectiveness of three study methods on exam performance. She randomly assigns 15 students to three groups (5 students per group) and records their exam scores:
| Method 1 (Traditional) | Method 2 (Flashcards) | Method 3 (Practice Tests) |
|---|---|---|
| 78 | 85 | 92 |
| 82 | 88 | 95 |
| 76 | 84 | 90 |
| 80 | 86 | 93 |
| 84 | 87 | 90 |
Test at α = 0.05: Do the study methods produce different mean exam scores?
Step 1: State the Hypotheses
- H₀: μ₁ = μ₂ = μ₃ (all three methods have equal mean scores)
- Hₐ: At least one method has a different mean score
Step 2: Calculate Group Means and Grand Mean
- Method 1: x̄₁ = (78 + 82 + 76 + 80 + 84) / 5 = 400 / 5 = 80
- Method 2: x̄₂ = (85 + 88 + 84 + 86 + 87) / 5 = 430 / 5 = 86
- Method 3: x̄₃ = (92 + 95 + 90 + 93 + 90) / 5 = 460 / 5 = 92
- Grand mean: x̄ = (400 + 430 + 460) / 15 = 1290 / 15 = 86
Step 3: Calculate Sum of Squares Between (SSB)
SSB = Σnj(x̄j - x̄)²
SSB = 5(80 - 86)² + 5(86 - 86)² + 5(92 - 86)²
SSB = 5(36) + 5(0) + 5(36) = 180 + 0 + 180 = 360
Step 4: Calculate Sum of Squares Within (SSW)
SSW = Σ(xij - x̄j)² for all groups
Method 1 (x̄₁ = 80): (78-80)² + (82-80)² + (76-80)² + (80-80)² + (84-80)² = 4 + 4 + 16 + 0 + 16 = 40
Method 2 (x̄₂ = 86): (85-86)² + (88-86)² + (84-86)² + (86-86)² + (87-86)² = 1 + 4 + 4 + 0 + 1 = 10
Method 3 (x̄₃ = 92): (92-92)² + (95-92)² + (90-92)² + (93-92)² + (90-92)² = 0 + 9 + 4 + 1 + 4 = 18
SSW = 40 + 10 + 18 = 68
Step 5: Calculate Total Sum of Squares (SST)
Computing Σ(xij - x̄)² directly over all 15 observations gives SST = 428.
We can verify the partition: SST = SSB + SSW = 360 + 68 = 428
Step 6: Calculate Degrees of Freedom
- dfbetween = k - 1 = 3 - 1 = 2
- dfwithin = N - k = 15 - 3 = 12
- dftotal = N - 1 = 15 - 1 = 14
Step 7: Calculate Mean Squares
- MSB = SSB / dfbetween = 360 / 2 = 180
- MSW = SSW / dfwithin = 68 / 12 ≈ 5.67
Step 8: Calculate F-Statistic
F = MSB / MSW = 180 / 5.67 ≈ 31.75
Step 9: Complete the ANOVA Table
| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between Groups | 360 | 2 | 180 | 31.75 |
| Within Groups | 68 | 12 | 5.67 | — |
| Total | 428 | 14 | — | — |
Step 10: Make a Decision
Using an F-table with df₁ = 2, df₂ = 12, and α = 0.05:
Critical value: Fcritical = 3.89
Since F = 31.75 > 3.89, we reject H₀.
Alternatively, using technology: p-value < 0.001. Since p-value < 0.05, we reject H₀.
Step 11: State the Conclusion
Conclusion: At the 0.05 significance level, there is sufficient evidence to conclude that at least one study method produces a different mean exam score.
Note: We don't yet know WHICH methods differ. We'll learn about post-hoc tests in Lesson 3!
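The whole worked example can be reproduced in a few lines of Python. A sketch using unrounded arithmetic throughout:

```python
# Sketch: reproduce the worked example's ANOVA quantities in pure Python.
groups = {
    "Traditional":    [78, 82, 76, 80, 84],
    "Flashcards":     [85, 88, 84, 86, 87],
    "Practice Tests": [92, 95, 90, 93, 90],
}

k = len(groups)
n = sum(len(g) for g in groups.values())
grand = sum(x for g in groups.values() for x in g) / n   # 86.0

ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values())
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g)

msb, msw = ssb / (k - 1), ssw / (n - k)
print(ssb, ssw, ssb + ssw)            # 360.0 68.0 428.0
print(round(msb, 2), round(msw, 2))   # 180.0 5.67
# Unrounded F is 31.76; the table's 31.75 comes from dividing by the rounded MSW.
print(round(msb / msw, 2))            # 31.76
```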
Check Your Understanding
Question 1
If you have 4 groups with sample sizes n₁ = 8, n₂ = 10, n₃ = 12, n₄ = 10, what are the degrees of freedom?
- k = 4 groups
- N = 8 + 10 + 12 + 10 = 40 total observations
- dfbetween = k - 1 = 4 - 1 = 3
- dfwithin = N - k = 40 - 4 = 36
- dftotal = N - 1 = 40 - 1 = 39
Question 2
Given SSB = 240, SSW = 160, dfbetween = 3, dfwithin = 36, calculate the F-statistic.
Step 1: Calculate MSB
MSB = SSB / dfbetween = 240 / 3 = 80
Step 2: Calculate MSW
MSW = SSW / dfwithin = 160 / 36 = 4.44
Step 3: Calculate F
F = MSB / MSW = 80 / 4.44 = 18.02
This is a large F-value, suggesting strong evidence of differences among groups!
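The same arithmetic as a short Python sketch (values taken from the question; note that dividing by the unrounded MSW gives exactly 18.0, while the 18.02 above comes from first rounding MSW to 4.44):

```python
# Sketch: F-statistic for Question 2 from the given sums of squares and df.
ssb, ssw = 240, 160
df_between, df_within = 3, 36

msb = ssb / df_between   # 80.0
msw = ssw / df_within    # ≈ 4.44 after rounding
f_stat = msb / msw

# Exact F is 18.0; the 18.02 in the worked answer divides by the rounded MSW.
print(round(msb, 2), round(msw, 2), round(f_stat, 2))  # 80.0 4.44 18.0
```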
Question 3
Why do we divide sum of squares by degrees of freedom to get mean squares?
Answer: We divide by degrees of freedom to get an average measure of variation that accounts for sample size.
This makes mean squares (which are variance estimates) comparable even when groups have different sample sizes. Without this adjustment, larger samples would always have larger sums of squares simply due to having more observations, even if the actual variability is the same.
Lesson Summary
- One-way ANOVA: One categorical factor, one quantitative dependent variable
- Sum of Squares:
- SST = total variation
- SSB = between-group variation
- SSW = within-group variation (error)
- SST = SSB + SSW
- Degrees of Freedom: dfbetween = k-1, dfwithin = N-k
- Mean Squares: MS = SS / df (average variation)
- F-statistic: F = MSB / MSW
- ANOVA table organizes all calculations systematically
- Large F-values (relative to critical value) lead to rejecting H₀