Confidence Intervals for Means
Construct and interpret confidence intervals for population means using the t-distribution
Lesson Objectives
By the end of this lesson, you will be able to:
- Construct confidence intervals for population means
- Understand and use the t-distribution
- Calculate degrees of freedom
- Find critical t-values from a t-table
- Determine when to use z vs t distributions
- Interpret confidence intervals for means correctly
1. The Confidence Interval Formula for Means
Confidence Interval for a Population Mean (μ)
x̄ ± t* × (s/√n)
Where:
x̄ = sample mean (point estimate)
t* = critical value from t-distribution
s = sample standard deviation
n = sample size
s/√n = standard error (SE)
This formula gives us the interval (lower bound, upper bound):
(x̄ - t* × (s/√n), x̄ + t* × (s/√n))
In most real-world situations, we don't know the population standard deviation σ. We only have the sample standard deviation s. When we use s to estimate σ, we introduce extra uncertainty, so we use the t-distribution instead of the normal (z) distribution.
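The formula can be sketched as a small Python function. This is a minimal illustration, not a full statistics routine: the function name `t_confidence_interval` is made up for this example, and the critical value t* is passed in by hand from a t-table rather than computed.

```python
import math

def t_confidence_interval(xbar, s, n, t_star):
    """Return (lower, upper) for the interval xbar ± t* × (s/√n)."""
    standard_error = s / math.sqrt(n)   # SE = s/√n
    margin = t_star * standard_error    # E = t* × SE
    return xbar - margin, xbar + margin

# Illustrative inputs: n = 25 gives df = 24, and 2.064 is the
# 95% t-table value for df = 24.
lower, upper = t_confidence_interval(xbar=100, s=15, n=25, t_star=2.064)
print(f"({lower:.2f}, {upper:.2f})")  # (93.81, 106.19)
```

Keeping t* as an argument mirrors the table-lookup workflow used in this lesson; statistical software would normally compute it for you.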
2. The t-Distribution
The t-Distribution
The t-distribution (Student's t-distribution) is similar to the standard normal distribution but has heavier tails. It's used when:
- The population standard deviation (σ) is unknown
- We use the sample standard deviation (s) as an estimate
Key properties of the t-distribution:
- Symmetric and bell-shaped (like the normal distribution)
- Centered at 0
- Has heavier tails than the normal distribution (more area in the tails)
- Shape depends on degrees of freedom (df)
- As df increases, t-distribution approaches the standard normal distribution
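To see the heavier tails numerically, the sketch below evaluates the t density and the standard normal density at a point far from the center (x = 3). The helper names `t_pdf` and `normal_pdf` are chosen for this example; the t density uses the standard closed-form expression with the gamma function.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Density out in the tail: larger values mean more area far from 0.
for df in (2, 5, 30):
    print(f"t, df = {df:>2}: density at x = 3 is {t_pdf(3, df):.4f}")
print(f"normal:     density at x = 3 is {normal_pdf(3):.4f}")
```

The t densities should all exceed the normal density at x = 3 and shrink toward it as df grows, which is the "approaches the standard normal" property in action.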
Degrees of Freedom
df = n - 1
where n is the sample size
Example 1: Finding Degrees of Freedom
a) A sample of n = 25 students: df = 25 - 1 = 24
b) A sample of n = 100 voters: df = 100 - 1 = 99
c) A sample of n = 10 measurements: df = 10 - 1 = 9
Finding Critical t-Values (t*)
The critical value t* depends on two things:
- Confidence level (90%, 95%, 99%, etc.)
- Degrees of freedom (df = n - 1)
We look up t* in a t-table or use technology. Here are common values:
| df | 90% CI (t*) | 95% CI (t*) | 99% CI (t*) |
|---|---|---|---|
| 5 | 2.015 | 2.571 | 4.032 |
| 10 | 1.812 | 2.228 | 3.169 |
| 20 | 1.725 | 2.086 | 2.845 |
| 30 | 1.697 | 2.042 | 2.750 |
| 100 | 1.660 | 1.984 | 2.626 |
| ∞ (z) | 1.645 | 1.960 | 2.576 |
Notice: As df increases, t* values get closer to z* values. With large samples (df > 100), t* ≈ z*.
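The convergence of t* toward z* can be checked with a short script. In this sketch the t* values are simply copied from the table above into a dictionary (the name `T_TABLE` is made up here), while z* is computed with `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

# t* values copied from the table above: df -> {confidence level: t*}
T_TABLE = {
    5:   {0.90: 2.015, 0.95: 2.571, 0.99: 4.032},
    10:  {0.90: 1.812, 0.95: 2.228, 0.99: 3.169},
    20:  {0.90: 1.725, 0.95: 2.086, 0.99: 2.845},
    30:  {0.90: 1.697, 0.95: 2.042, 0.99: 2.750},
    100: {0.90: 1.660, 0.95: 1.984, 0.99: 2.626},
}

def z_star(confidence):
    """Two-sided critical value from the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# For each confidence level, show how far t* sits above z* at each df.
for level in (0.90, 0.95, 0.99):
    z = z_star(level)
    gaps = ", ".join(f"df={df}: {row[level] - z:+.3f}" for df, row in T_TABLE.items())
    print(f"{level:.0%}  z* = {z:.3f}  t* - z*: {gaps}")
```

Every gap is positive and shrinks as df grows, so the t-based interval is always at least slightly wider than the z-based one.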
3. Constructing Confidence Intervals for Means
Steps to Construct a CI for μ:
- Check conditions:
  - Random sample
  - Population is approximately normal OR n ≥ 30 (Central Limit Theorem)
  - If n < 30, check that data doesn't have strong skewness or outliers
- Calculate sample statistics: x̄ and s
- Determine df: df = n - 1
- Find critical value: Look up t* for desired confidence level and df
- Calculate margin of error: E = t* × (s/√n)
- Construct interval: x̄ ± E
- Interpret: State conclusion in context
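The computational steps above can be sketched in Python, starting from raw data. This is only an illustration: the function name `mean_ci_from_data` and the sample values are hypothetical, t* is still looked up by hand, and the code cannot check the conditions (random sampling, approximate normality) for you.

```python
import math
import statistics

def mean_ci_from_data(data, t_star):
    """Compute x̄, s, df, E, and the interval x̄ ± E from raw data."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)           # sample standard deviation (n - 1 divisor)
    df = n - 1
    margin = t_star * s / math.sqrt(n)   # E = t* × (s/√n)
    return {
        "n": n, "df": df, "mean": xbar, "s": s,
        "margin": margin, "interval": (xbar - margin, xbar + margin),
    }

# Hypothetical measurements, made up for illustration (n = 6, so df = 5).
sample = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]
result = mean_ci_from_data(sample, t_star=2.571)  # 95% t* for df = 5

low, high = result["interval"]
print(f"df = {result['df']}, mean = {result['mean']:.2f}, s = {result['s']:.3f}")
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Note that `statistics.stdev` divides by n - 1, which matches the sample standard deviation s used throughout this lesson; `statistics.pstdev` would not.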
Example 2: Complete CI Calculation
Problem: A nutritionist measures the daily calorie intake for a random sample of 15 college students. The sample mean is x̄ = 2250 calories with standard deviation s = 320 calories. Construct a 95% confidence interval for the mean daily calorie intake of all college students.
Solution:
Step 1: Check conditions
- Random sample (stated)
- Assume calorie intake is approximately normally distributed
Step 2: Sample statistics
- x̄ = 2250 calories
- s = 320 calories
- n = 15
Step 3: Degrees of freedom
df = n - 1 = 15 - 1 = 14
Step 4: Critical value
For 95% CI with df = 14: t* = 2.145 (from t-table)
Step 5: Margin of error
E = t* × (s/√n) = 2.145 × (320/√15) = 2.145 × 82.62 = 177.2 calories
Step 6: Construct interval
2250 ± 177.2 = (2072.8, 2427.2)
Step 7: Interpret
Conclusion: We are 95% confident that the true mean daily calorie intake for all college students is between 2073 and 2427 calories.
Example 3: Larger Sample Size
Problem: A company samples 50 employees and finds mean commute time x̄ = 32 minutes with s = 12 minutes. Find a 90% confidence interval for the mean commute time.
Solution:
Given: x̄ = 32, s = 12, n = 50
df = 50 - 1 = 49
For 90% CI with df = 49: t* ≈ 1.677 (from t-table)
E = 1.677 × (12/√50) = 1.677 × 1.697 = 2.85 minutes
CI: 32 ± 2.85 = (29.15, 34.85)
Interpretation: We are 90% confident that the true mean commute time for all employees is between 29.2 and 34.9 minutes.
4. When to Use z vs. t
| Use t-distribution when: | Use z-distribution when: |
|---|---|
| σ is unknown (most common) | σ is known (rare in practice) |
| Using sample standard deviation s | Sample size is very large (n > 100) |
| Any sample size | Working with proportions |
When in doubt, use t. The t-distribution is almost always correct for means when σ is unknown. As n increases, t approaches z anyway, so t is the safer choice.
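A quick comparison shows why t is the safer default. The sketch below reuses the commute-time summary from Example 3 and computes the 90% margin of error twice, once with the table value t* and once with z* from `statistics.NormalDist`. Variable names are chosen for this example.

```python
import math
from statistics import NormalDist

xbar, s, n = 32, 12, 50               # summary statistics from Example 3
se = s / math.sqrt(n)                 # standard error s/√n

t_star = 1.677                        # 90% t-table value for df = 49
z_star = NormalDist().inv_cdf(0.95)   # 90% two-sided z*, about 1.645

margin_t = t_star * se
margin_z = z_star * se
print(f"t-based margin: {margin_t:.2f} minutes")
print(f"z-based margin: {margin_z:.2f} minutes")
print(f"difference:     {margin_t - margin_z:.3f} minutes")
```

With n = 50 the two margins differ only slightly, and the t-based margin is the larger of the two, so choosing t costs little and never understates the uncertainty.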
Example 4: Choosing z vs. t
a) Sample of n = 25, σ unknown → Use t
b) Sample of n = 200, σ unknown → Use t (or z, they're almost equal)
c) Population σ = 10 is known, n = 40 → Use z
d) Proportion problem (p̂) → Use z
5. Effect of Confidence Level on Interval Width
Example 5: Comparing Confidence Levels
Given: x̄ = 100, s = 15, n = 25 (so df = 24)
90% CI:
t* = 1.711, E = 1.711 × (15/√25) = 5.13
CI: 100 ± 5.13 = (94.87, 105.13) — width = 10.26
95% CI:
t* = 2.064, E = 2.064 × (15/√25) = 6.19
CI: 100 ± 6.19 = (93.81, 106.19) — width = 12.38
99% CI:
t* = 2.797, E = 2.797 × (15/√25) = 8.39
CI: 100 ± 8.39 = (91.61, 108.39) — width = 16.78
Observation: Higher confidence → wider interval. This is the trade-off between confidence and precision.
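The same comparison can be run in a loop. This sketch uses the Example 5 inputs and the df = 24 table values quoted above; the helper `interval_width` is a name introduced for this illustration.

```python
import math

xbar, s, n = 100, 15, 25   # Example 5 inputs (df = 24)
t_stars = {"90%": 1.711, "95%": 2.064, "99%": 2.797}   # t-table values for df = 24

def interval_width(s, n, t_star):
    """Width of the interval x̄ ± t* × (s/√n), which is twice the margin of error."""
    return 2 * t_star * s / math.sqrt(n)

widths = {level: interval_width(s, n, t) for level, t in t_stars.items()}
for level, width in widths.items():
    print(f"{level} CI width: {width:.3f}")
```

The widths printed here are computed without intermediate rounding, so the last digit can differ slightly from the hand calculation, which rounds E to two decimals before doubling it.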
Check Your Understanding
Question 1: A sample of n = 20 has x̄ = 50 and s = 8. What are the degrees of freedom?
Answer: df = n - 1 = 20 - 1 = 19
Question 2: Why do we use the t-distribution instead of z when σ is unknown?
Answer: Using s to estimate σ introduces extra uncertainty. The t-distribution has heavier tails to account for this additional uncertainty, giving more conservative (wider) intervals.
Question 3: For a 95% CI with df = 10, would t* be larger or smaller than z* = 1.96?
Answer: Larger. With df = 10, t* = 2.228, which is larger than z* = 1.96. This makes the interval wider to account for the smaller sample size and using s instead of σ.
Question 4: A 99% CI for μ is (45, 55). What is the margin of error?
Answer: The margin of error is 5. The interval width is 55 - 45 = 10, and margin of error is half the width: 10/2 = 5. Alternatively, the center is 50, and E = 55 - 50 = 5.
Question 5: If you want a narrower CI but keep the same confidence level, what should you do?
Answer: Increase the sample size (n). Since margin of error = t* × (s/√n), increasing n decreases s/√n, which decreases the margin of error and makes the interval narrower.
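The effect of sample size can be illustrated numerically. In this sketch s = 15 is a hypothetical value held fixed across sample sizes (a simplification, since s would vary from sample to sample), and the 95% t* values are the ones from the t-table in Section 2 for df = n - 1.

```python
import math

s = 15   # hypothetical sample standard deviation, held fixed for illustration

# Sample size n -> 95% t* for df = n - 1, taken from the lesson's t-table
t_star_95 = {6: 2.571, 11: 2.228, 21: 2.086, 31: 2.042, 101: 1.984}

# E = t* × (s/√n) for each sample size
margins = {n: t * s / math.sqrt(n) for n, t in t_star_95.items()}
for n, margin in margins.items():
    print(f"n = {n:>3}: margin of error = {margin:.2f}")
```

Both factors work in the same direction: as n grows, t* gets smaller and s/√n gets smaller, so the margin of error shrinks.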
Lesson Summary
- CI for μ: x̄ ± t* × (s/√n)
- Use t-distribution when σ is unknown (most common situation)
- Degrees of freedom: df = n - 1
- Find t* from t-table using df and confidence level
- As df increases, t-distribution approaches standard normal (z)
- Higher confidence → larger t* → wider interval
- Larger sample → smaller SE → narrower interval