Paired Samples (Matched Pairs) Tests
Learn how to analyze dependent samples using paired t-tests
Lesson Objectives
By the end of this lesson, you will be able to:
- Identify situations requiring paired samples tests
- Understand why paired tests are more powerful than independent tests
- Calculate differences for paired data
- Conduct paired t-tests using the one-sample approach on differences
- Interpret results from before/after and matched pairs studies
1. What Are Paired Samples?
Definition: Paired (Dependent) Samples
Paired samples (also called matched pairs or dependent samples) occur when each observation in one sample is naturally paired or matched with a specific observation in the other sample.
Common Paired Designs
| Design Type | Description | Example |
|---|---|---|
| Before-After | Same subjects measured at two time points | Weight before and after diet program |
| Matched Subjects | Subjects matched on important characteristics | Identical twins, one in each group |
| Same Subject, Two Conditions | Each subject experiences both treatments | Left eye vs right eye vision |
| Pre-Post Testing | Performance before and after intervention | Pre-test and post-test scores |
Why Use Paired Designs?
By comparing each subject to themselves (or to their matched partner), we remove variation due to individual differences, which would otherwise add noise to the data and make the treatment effect harder to detect.
Example: Why Pairing Matters
Suppose you're testing a weight loss program. You have two options:
Option 1 (Independent): Randomly assign 50 people to the diet, 50 to control. Compare final weights.
Option 2 (Paired): Measure weight before and after the diet for 50 people. Compare each person's change.
Why is Option 2 better?
People naturally have very different weights, so person-to-person variability is high. In Option 1, this variability makes it harder to detect the diet's effect. In Option 2, you compare each person to themselves, so that variability cancels out of the differences. As a result, you need fewer participants to detect the same effect.
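A small simulation can make this concrete. The sketch below uses made-up population values (a mean starting weight of 190 lb, a person-to-person SD of 25 lb, and a true loss of about 5 lb with SD 2 lb). None of these numbers come from a real study; they are assumptions chosen only for illustration. It compares the standard error of the estimated effect under each option.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

N = 50  # people per group (Option 1) or number of pairs (Option 2)

def simulate_weight():
    """Draw one person's starting weight (hypothetical: mean 190 lb, SD 25 lb)."""
    return random.gauss(190, 25)

def simulate_loss():
    """Draw one person's weight loss on the diet (hypothetical: mean 5 lb, SD 2 lb)."""
    return random.gauss(5, 2)

# Option 1 (independent): two separate groups, compare final weights.
control_final = [simulate_weight() for _ in range(N)]
diet_final = [simulate_weight() - simulate_loss() for _ in range(N)]
se_independent = math.sqrt(
    statistics.variance(control_final) / N + statistics.variance(diet_final) / N
)

# Option 2 (paired): one group, analyze each person's own change.
before = [simulate_weight() for _ in range(N)]
after = [w - simulate_loss() for w in before]
differences = [b - a for b, a in zip(before, after)]
se_paired = statistics.stdev(differences) / math.sqrt(N)

print(f"SE, independent groups: {se_independent:.2f} lb")
print(f"SE, paired differences: {se_paired:.2f} lb")
```

Under these assumptions the paired standard error comes out far smaller, because the large spread in starting weights cancels when each person is compared with themselves.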
2. The Paired t-Test: Process and Logic
The key insight: A paired t-test is actually a one-sample t-test on the differences!
Step-by-Step Process
- Calculate differences for each pair: d = x₁ - x₂ (or x₂ - x₁; be consistent). Each pair gives you one difference value.
- Calculate the mean of the differences (d̄): d̄ = Σd / n. This is your sample mean difference.
- Calculate the standard deviation of the differences (sd): sd = √[Σ(d - d̄)² / (n - 1)]. This measures the variability in the differences.
- Conduct a one-sample t-test on the differences: test whether the mean difference is significantly different from 0.
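The four steps above can be sketched as a short Python function. This is a minimal illustration using only the standard library; the function name `paired_t_statistic` and the three-subject data set are hypothetical, not part of the lesson.

```python
import math
import statistics

def paired_t_statistic(x1, x2, mu_d0=0.0):
    """Return (d_bar, s_d, t, df) for paired samples x1 and x2.

    Differences are computed as x1 - x2, so the sign of t follows that order.
    """
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(x1, x2)]          # Step 1: one difference per pair
    n = len(d)
    d_bar = statistics.mean(d)                   # Step 2: mean of the differences
    s_d = statistics.stdev(d)                    # Step 3: sample SD (divides by n - 1)
    t = (d_bar - mu_d0) / (s_d / math.sqrt(n))   # Step 4: one-sample t on the differences
    return d_bar, s_d, t, n - 1

# Hypothetical scores for three subjects under two conditions.
d_bar, s_d, t, df = paired_t_statistic([10, 12, 14], [8, 11, 11])
print(d_bar, s_d, round(t, 3), df)
```

Note that the function never looks at the two samples separately after Step 1; everything downstream operates on the single list of differences.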
Hypotheses for Paired t-Test
We're testing whether the mean population difference (μd) equals zero:
| Test Type | Null Hypothesis (H₀) | Alternative Hypothesis (Hₐ) |
|---|---|---|
| Two-tailed | H₀: μd = 0 | Hₐ: μd ≠ 0 |
| Right-tailed | H₀: μd = 0 | Hₐ: μd > 0 |
| Left-tailed | H₀: μd = 0 | Hₐ: μd < 0 |
Note: μd = 0 means "no difference on average" between the two conditions. Rejecting this means there IS a significant difference.
3. Test Statistic and Formula
Test Statistic for Paired t-Test
t = (d̄ - μd₀) / (sd / √n)
where:
d̄ = mean of the differences
μd₀ = hypothesized mean difference (usually 0)
sd = standard deviation of the differences
n = number of pairs
Degrees of Freedom
df = n - 1
where n = number of pairs (NOT total number of observations!)
Conditions for Paired t-Test
- Random Sampling: The pairs are randomly selected
- Independence of Pairs: Each pair is independent of other pairs
- Normality of Differences:
- The differences are approximately normally distributed, OR
- The sample size is large (n ≥ 30 pairs)
4. Complete Example: Before-After Study
Example: Weight Loss Program
Research Question: Does a new diet program result in significant weight loss?
A dietitian measures the weight of 8 participants before and after a 6-week program. Test at α = 0.05.
| Participant | Before (lb) | After (lb) | Difference d = Before - After |
|---|---|---|---|
| 1 | 185 | 178 | 7 |
| 2 | 210 | 205 | 5 |
| 3 | 168 | 165 | 3 |
| 4 | 195 | 188 | 7 |
| 5 | 220 | 210 | 10 |
| 6 | 172 | 170 | 2 |
| 7 | 198 | 192 | 6 |
| 8 | 205 | 198 | 7 |
Step 1: State hypotheses
- H₀: μd = 0 (no weight loss on average)
- Hₐ: μd > 0 (weight loss occurred) — right-tailed test
Note: d = Before - After, so positive d means weight loss
Step 2: Check conditions
- Participants randomly selected
- Pairs are independent (different people)
- n = 8 < 30, so we need to assume differences are approximately normal (reasonable for weight loss)
Step 3: Calculate d̄ and sd
Mean difference:
d̄ = (7 + 5 + 3 + 7 + 10 + 2 + 6 + 7) / 8 = 47 / 8 = 5.875 lb
Standard deviation of differences:
First calculate deviations from mean and square them:
(7-5.875)² = 1.266, (5-5.875)² = 0.766, (3-5.875)² = 8.266, etc.
Sum of squared deviations = 44.875
sd = √(44.875 / 7) = √6.411 = 2.532 lb
Step 4: Calculate test statistic
t = (5.875 - 0) / (2.532 / √8)
t = 5.875 / (2.532 / 2.828)
t = 5.875 / 0.895
t ≈ 6.56
Step 5: Find critical value and p-value
df = n - 1 = 8 - 1 = 7
For α = 0.05 (right-tailed), t-critical(7) ≈ 1.895
With t = 6.56, p-value < 0.001
Step 6: Make decision
Since t = 6.56 > 1.895 (or p-value < 0.05), we reject H₀.
Step 7: Conclusion
There is sufficient evidence at the 0.05 significance level to conclude that the diet program results in significant weight loss. On average, participants lost 5.875 pounds.
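The hand calculation in this example can be checked with a few lines of Python. The sketch below uses only the standard library and takes the critical value 1.895 from the t-table, as quoted in Step 5, rather than computing it; the variable names are illustrative.

```python
import math
import statistics

before = [185, 210, 168, 195, 220, 172, 198, 205]
after = [178, 205, 165, 188, 210, 170, 192, 198]

d = [b - a for b, a in zip(before, after)]   # d = Before - After
n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)                    # sample SD, divides by n - 1
standard_error = s_d / math.sqrt(n)
t_stat = d_bar / standard_error              # hypothesized mean difference is 0

T_CRITICAL = 1.895  # right-tailed, alpha = 0.05, df = 7 (from a t-table)
reject_h0 = t_stat > T_CRITICAL

print(f"differences: {d}")
print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, SE = {standard_error:.3f}")
print(f"t = {t_stat:.2f}, df = {n - 1}, reject H0: {reject_h0}")
```

If SciPy happens to be installed, `scipy.stats.ttest_rel(before, after, alternative="greater")` should report the same statistic together with an exact p-value.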
5. Paired vs. Independent: When to Use Which?
| Aspect | Paired t-Test | Independent Two-Sample t-Test |
|---|---|---|
| Data Structure | Same subjects measured twice OR matched pairs | Two separate, unrelated groups |
| Sample Size | n = number of pairs | n₁ and n₂ for each group |
| What We Analyze | Differences (d = x₁ - x₂) | Two separate samples |
| Degrees of Freedom | df = n - 1 | df = n₁ + n₂ - 2 (pooled variances) or the Welch-Satterthwaite approximation (unpooled) |
| Power | Higher (controls individual variability) | Lower (more noise from individual differences) |
| Example | Blood pressure before/after medication | Blood pressure in Med A group vs Med B group |
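To illustrate the Power row, the sketch below takes the weight data from Section 4 and deliberately (and incorrectly) treats the Before and After columns as two unrelated groups, computing an unpooled (Welch-style) two-sample statistic. It then computes the paired statistic on the same numbers. The helper names are hypothetical.

```python
import math
import statistics

before = [185, 210, 168, 195, 220, 172, 198, 205]
after = [178, 205, 165, 188, 210, 170, 192, 198]

def independent_t(x1, x2):
    """Unpooled two-sample t statistic: ignores the pairing entirely."""
    se = math.sqrt(
        statistics.variance(x1) / len(x1) + statistics.variance(x2) / len(x2)
    )
    return (statistics.mean(x1) - statistics.mean(x2)) / se

def paired_t(x1, x2):
    """Paired t statistic: a one-sample t on the differences x1 - x2."""
    d = [a - b for a, b in zip(x1, x2)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))

t_independent = independent_t(before, after)
t_paired = paired_t(before, after)

print(f"treated as independent groups: t = {t_independent:.2f}")
print(f"treated as paired data:        t = {t_paired:.2f}")
```

Both statistics have the same numerator (the difference in means); only the standard error in the denominator changes. Person-to-person spread in weight inflates the independent-groups standard error, while the paired version depends only on the spread of the differences.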
6. Confidence Interval for Mean Difference
In addition to hypothesis testing, you can construct a confidence interval for the mean difference μd:
Confidence Interval for μd
d̄ ± t* × (sd / √n)
where t* is the critical value from the t-distribution with df = n - 1
Example: Confidence Interval for Weight Loss
Using the weight loss data from Section 4:
- d̄ = 5.875, sd = 2.532, n = 8, df = 7
- For 95% CI: t* ≈ 2.365
95% CI = 5.875 ± 2.365 × (2.532 / √8)
= 5.875 ± 2.365 × 0.895
= 5.875 ± 2.117
= (3.76, 7.99) pounds
Interpretation: We are 95% confident that the true mean weight loss is between 3.76 and 7.99 pounds. Since this interval does not contain 0, we can conclude there is significant weight loss (consistent with our hypothesis test).
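The interval can also be computed directly from the differences. This is a minimal sketch using the standard library; the function name `paired_confidence_interval` is hypothetical, and the critical value t* is passed in from a t-table rather than computed.

```python
import math
import statistics

def paired_confidence_interval(differences, t_star):
    """Return (lower, upper) for the mean difference, given a t* critical value."""
    n = len(differences)
    d_bar = statistics.mean(differences)
    margin = t_star * statistics.stdev(differences) / math.sqrt(n)
    return d_bar - margin, d_bar + margin

d = [7, 5, 3, 7, 10, 2, 6, 7]   # Before - After, from the weight loss example
lower, upper = paired_confidence_interval(d, t_star=2.365)  # 95% level, df = 7

print(f"95% CI: ({lower:.2f}, {upper:.2f}) lb")
print("contains 0:", lower <= 0 <= upper)
```

Passing a different t* (for example, the 90% or 99% critical value for df = 7) gives the interval at that confidence level without changing the function.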
Check Your Understanding
Question 1: A researcher wants to compare reading speeds before and after a speed-reading course. She tests 20 people before the course and 20 different people after the course. Should she use a paired or independent test?
Question 2: If you have 15 pairs of data, what is the degrees of freedom for a paired t-test?
Question 3: In a paired t-test, you calculate d̄ = -3.2 and get t = -2.8 with p-value = 0.01 for a two-tailed test at α = 0.05. What is your conclusion if d = Before - After?
Key Takeaways
- Paired t-tests are used when the same subjects are measured twice or when subjects are matched
- The key is to calculate differences (d) for each pair
- A paired t-test is a one-sample t-test on the differences
- Test statistic: t = (d̄ - 0) / (sd / √n) with df = n - 1
- Paired designs are more powerful than independent designs because they control for individual variability
- Hypotheses: H₀: μd = 0 vs. Hₐ: μd ≠ 0 (or <, >)
- Always specify what the difference represents (Before - After or After - Before)