Safaa Dabagh

Lesson 2: One-Way ANOVA Procedure

Step-by-Step Calculations and the ANOVA Table

What is One-Way ANOVA?

One-way ANOVA is used when we have:

  • One categorical independent variable (the factor) with three or more groups
  • One continuous dependent variable

Example

Factor: Teaching method (Traditional, Flipped, Project-based)

Dependent variable: Final exam score

This is "one-way" because we're looking at the effect of ONE factor (teaching method).

Note: There's also two-way ANOVA (two factors) and other variations, but we'll focus on one-way ANOVA in this course.

Understanding Sum of Squares

ANOVA breaks down the total variation in the data into two components:

1. Total Sum of Squares (SST)

Measures the total variation of all observations from the grand mean.

SST = Σ(xij - x̄)²

Where:

  • xij = the i-th observation in group j
  • x̄ = the grand mean (the mean of all observations)

2. Between-Group Sum of Squares (SSB)

Measures the variation between group means.

SSB = Σnj(x̄j - x̄)²

Where:

  • nj = the sample size of group j
  • x̄j = the mean of group j
  • x̄ = the grand mean

3. Within-Group Sum of Squares (SSW)

Measures the variation within each group (error/residual).

SSW = Σ(xij - x̄j)²

Where:

  • xij = the i-th observation in group j
  • x̄j = the mean of group j

Fundamental Relationship

SST = SSB + SSW

Total variation = Variation between groups + Variation within groups

This is the key partitioning that makes ANOVA work!
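This partitioning can be checked numerically. Here is a minimal pure-Python sketch using made-up data for three hypothetical groups (the data are illustrative, not from the lesson):

```python
# Verify SST = SSB + SSW on small made-up data (three hypothetical groups)
groups = [[78, 82, 76], [85, 88, 84], [92, 95, 90]]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)

# SSB: squared deviation of each group mean from the grand mean, weighted by n_j
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# SSW: squared deviations of observations from their own group mean
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

# SST: squared deviations of all observations from the grand mean
sst = sum((x - grand_mean) ** 2 for x in all_obs)

print(sst, ssb + ssw)  # the two values agree
```

Whatever data you substitute, SST and SSB + SSW match (up to floating-point error), because the identity holds algebraically.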

Degrees of Freedom

Each sum of squares has associated degrees of freedom (df):

dfbetween = k - 1

Where k = number of groups

dfwithin = N - k

Where N = total sample size (all observations)

dftotal = N - 1

Relationship

dftotal = dfbetween + dfwithin

(N - 1) = (k - 1) + (N - k)

Example

If we have 4 groups with 10 observations each:

  • k = 4 groups
  • N = 40 total observations
  • dfbetween = 4 - 1 = 3
  • dfwithin = 40 - 4 = 36
  • dftotal = 40 - 1 = 39
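The degrees-of-freedom bookkeeping above can be wrapped in a small helper (a sketch; the function name is ours, not part of the lesson):

```python
# Degrees of freedom for one-way ANOVA: k groups, N total observations
def anova_df(group_sizes):
    k = len(group_sizes)
    N = sum(group_sizes)
    return k - 1, N - k, N - 1  # (between, within, total)

print(anova_df([10, 10, 10, 10]))  # → (3, 36, 39)
```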

Mean Squares

Mean squares are the average squared deviations. We calculate them by dividing each sum of squares by its degrees of freedom.

Mean Square Between (MSB)

MSB = SSB / dfbetween = SSB / (k - 1)

This estimates the variance between groups.

Mean Square Within (MSW)

MSW = SSW / dfwithin = SSW / (N - k)

This estimates the variance within groups (pooled error variance).

Why Mean Squares?

We divide by degrees of freedom to get an average measure of variation that is comparable across different sample sizes. Mean squares are estimates of variance.

The F-Statistic

Finally, we calculate the F-statistic by comparing the two mean squares:

F = MSB / MSW

Interpretation

  • If MSB >> MSW (F is large): Between-group differences are large compared to within-group variation → groups likely differ
  • If MSB ≈ MSW (F ≈ 1): Between-group differences are comparable to within-group variation → no evidence that the groups differ

The F-statistic follows an F-distribution with df₁ = k - 1 and df₂ = N - k degrees of freedom.
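
As a sketch of how this plays out in code (assuming SciPy is available; the mean squares below are made-up numbers for illustration), the F-statistic and its p-value come straight from the F-distribution:

```python
from scipy.stats import f  # SciPy's F-distribution

# Made-up mean squares and degrees of freedom, for illustration only
msb, msw = 120.0, 30.0
df_between, df_within = 3, 36

F = msb / msw                              # 4.0
p_value = f.sf(F, df_between, df_within)   # survival function = P(F-dist > F)
print(F, p_value)
```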

The ANOVA Table

We organize all ANOVA calculations in a standard table format:

Source of Variation   Sum of Squares (SS)   Degrees of Freedom (df)   Mean Square (MS)    F-statistic
Between Groups        SSB                   k - 1                     MSB = SSB/(k-1)     F = MSB/MSW
Within Groups         SSW                   N - k                     MSW = SSW/(N-k)
Total                 SST                   N - 1

Complete Step-by-Step Example

Scenario

A professor wants to compare the effectiveness of three study methods on exam performance. She randomly assigns 15 students to three groups (5 students per group) and records their exam scores:

Method 1 (Traditional)   Method 2 (Flashcards)   Method 3 (Practice Tests)
78                       85                      92
82                       88                      95
76                       84                      90
80                       86                      93
84                       87                      90

Test at α = 0.05: Do the study methods produce different mean exam scores?

Step 1: State the Hypotheses

  • H₀: μ₁ = μ₂ = μ₃ (all three methods have equal mean scores)
  • Hₐ: At least one method has a different mean score

Step 2: Calculate Group Means and Grand Mean

Method 1: x̄₁ = (78 + 82 + 76 + 80 + 84) / 5 = 400 / 5 = 80
Method 2: x̄₂ = (85 + 88 + 84 + 86 + 87) / 5 = 430 / 5 = 86
Method 3: x̄₃ = (92 + 95 + 90 + 93 + 90) / 5 = 460 / 5 = 92
Grand Mean: x̄ = (400 + 430 + 460) / 15 = 1290 / 15 = 86

Step 3: Calculate Sum of Squares Between (SSB)

SSB = Σnj(x̄j - x̄)²

SSB = 5(80-86)² + 5(86-86)² + 5(92-86)²
SSB = 5(-6)² + 5(0)² + 5(6)²
SSB = 5(36) + 0 + 5(36)
SSB = 180 + 0 + 180 = 360

Step 4: Calculate Sum of Squares Within (SSW)

SSW = Σ(xij - x̄j)² for all groups

Method 1:

(78-80)² + (82-80)² + (76-80)² + (80-80)² + (84-80)² = 4 + 4 + 16 + 0 + 16 = 40

Method 2:

(85-86)² + (88-86)² + (84-86)² + (86-86)² + (87-86)² = 1 + 4 + 4 + 0 + 1 = 10

Method 3:

(92-92)² + (95-92)² + (90-92)² + (93-92)² + (90-92)² = 0 + 9 + 4 + 1 + 4 = 18
SSW = 40 + 10 + 18 = 68

Step 5: Calculate Total Sum of Squares (SST)

We can verify: SST = SSB + SSW

SST = 360 + 68 = 428
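
The hand calculations in Steps 2 through 5 can be replicated in a few lines of Python (a sketch, not part of the original lesson):

```python
# Exam scores for the three study methods
method1 = [78, 82, 76, 80, 84]   # Traditional
method2 = [85, 88, 84, 86, 87]   # Flashcards
method3 = [92, 95, 90, 93, 90]   # Practice Tests
groups = [method1, method2, method3]

all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)   # 86.0

# Sum of squares, mirroring the hand calculations above
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)  # 360.0
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)         # 68.0
sst = sum((x - grand_mean) ** 2 for x in all_scores)                     # 428.0

print(grand_mean, ssb, ssw, sst)  # 86.0 360.0 68.0 428.0
```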

Step 6: Calculate Degrees of Freedom

dfbetween = k - 1 = 3 - 1 = 2
dfwithin = N - k = 15 - 3 = 12
dftotal = N - 1 = 15 - 1 = 14

Step 7: Calculate Mean Squares

MSB = SSB / dfbetween = 360 / 2 = 180
MSW = SSW / dfwithin = 68 / 12 = 5.67

Step 8: Calculate F-Statistic

F = MSB / MSW = 180 / 5.67 = 31.75
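
If SciPy is available, scipy.stats.f_oneway gives the same F-statistic directly; shown here as a cross-check (the exact value is 31.76..., and the hand calculation gets 31.75 because MSW was rounded to 5.67):

```python
from scipy.stats import f_oneway

# One-way ANOVA on the three study-method groups
result = f_oneway([78, 82, 76, 80, 84],
                  [85, 88, 84, 86, 87],
                  [92, 95, 90, 93, 90])
print(result.statistic, result.pvalue)  # F ≈ 31.76, p far below 0.001
```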

Step 9: Complete ANOVA Table

Source           SS    df   MS     F
Between Groups   360   2    180    31.75
Within Groups    68    12   5.67
Total            428   14

Step 10: Make Decision

Using an F-table with df₁ = 2, df₂ = 12, and α = 0.05:

Critical value: Fcritical = 3.89

F = 31.75 > 3.89 → Reject H₀

Alternatively, using technology: p-value < 0.001

Since p-value < 0.05, we reject H₀.
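
Both the critical value and the p-value can be obtained from SciPy's F-distribution (a sketch assuming SciPy; the observed F is taken from the table above):

```python
from scipy.stats import f

alpha, df1, df2 = 0.05, 2, 12
f_crit = f.ppf(1 - alpha, df1, df2)   # critical value ≈ 3.885 (table gives 3.89)
p_value = f.sf(31.76, df1, df2)       # p-value for the observed F

reject_h0 = 31.76 > f_crit            # True → reject H0
print(f_crit, p_value, reject_h0)
```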

Step 11: State Conclusion

Conclusion: At the 0.05 significance level, there is sufficient evidence to conclude that at least one study method produces a different mean exam score.

Note: We don't yet know WHICH methods differ. We'll learn about post-hoc tests in Lesson 3!

Check Your Understanding

Question 1

If you have 4 groups with sample sizes n₁ = 8, n₂ = 10, n₃ = 12, n₄ = 10, what are the degrees of freedom?

  • k = 4 groups
  • N = 8 + 10 + 12 + 10 = 40 total observations
  • dfbetween = k - 1 = 4 - 1 = 3
  • dfwithin = N - k = 40 - 4 = 36
  • dftotal = N - 1 = 40 - 1 = 39

Question 2

Given SSB = 240, SSW = 160, dfbetween = 3, dfwithin = 36, calculate the F-statistic.

Step 1: Calculate MSB

MSB = SSB / dfbetween = 240 / 3 = 80

Step 2: Calculate MSW

MSW = SSW / dfwithin = 160 / 36 = 4.44

Step 3: Calculate F

F = MSB / MSW = 80 / 4.44 = 18.02
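
The three steps above can be sketched in Python (note that exact arithmetic gives F = 18.0; the value 18.02 comes from rounding MSW to 4.44 before dividing):

```python
# Given sums of squares and degrees of freedom from the question
ssb, ssw = 240, 160
df_between, df_within = 3, 36

msb = ssb / df_between   # 80.0
msw = ssw / df_within    # ≈ 4.444
F = msb / msw            # 18.0 with unrounded MSW

print(msb, msw, F)
```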

This is a large F-value, suggesting strong evidence of differences among groups!

Question 3

Why do we divide sum of squares by degrees of freedom to get mean squares?

Answer: We divide by degrees of freedom to get an average measure of variation that accounts for sample size.

This makes mean squares (which are variance estimates) comparable even when groups have different sample sizes. Without this adjustment, larger samples would always have larger sums of squares simply due to having more observations, even if the actual variability is the same.

Lesson Summary