Introduction to Confidence Intervals
Learn what confidence intervals are, how to interpret them, and why we use them
Lesson Objectives
By the end of this lesson, you will be able to:
- Distinguish between point estimates and interval estimates
- Explain what a confidence interval is
- Interpret confidence levels correctly
- Understand margin of error and its components
- Avoid common misconceptions about confidence intervals
1. Point Estimates vs. Interval Estimates
Key Definitions
Point Estimate: A single number used to estimate a population parameter.
Interval Estimate (Confidence Interval): A range of values used to estimate a population parameter.
When we take a sample and calculate statistics like x̄ or p̂, we get point estimates of the population parameters μ and p. But we know from sampling distributions that different samples give different estimates due to sampling variability.
Example 1: Point Estimate
Scenario: A university surveys 200 students and finds a sample mean GPA of x̄ = 3.24.
Point estimate: Our best guess for the population mean GPA is μ ≈ 3.24.
The problem: We know x̄ = 3.24 is probably not exactly equal to μ. Different samples would give different values. How confident should we be in this estimate?
This is where interval estimates (confidence intervals) come in. Instead of saying "μ is approximately 3.24," we say something like:
"We are 95% confident that μ is between 3.15 and 3.33"
This interval gives us a range of plausible values for the parameter, along with a measure of our confidence in that range.
2. What is a Confidence Interval?
Definition: Confidence Interval
A confidence interval is a range of values (an interval) that is likely to contain an unknown population parameter. It consists of:
- A point estimate (like x̄ or p̂)
- A margin of error (how much uncertainty we have)
- A confidence level (how confident we are, typically 90%, 95%, or 99%)
General Form of a Confidence Interval
Point Estimate ± Margin of Error
Or equivalently: (Lower Bound, Upper Bound)
Example 2: Understanding the Structure
A poll reports: "52% of voters support the measure (margin of error ± 3% at 95% confidence)"
Breaking it down:
- Point estimate: p̂ = 0.52 (52%)
- Margin of error: 0.03 (3 percentage points)
- Confidence level: 95%
- Confidence interval: 0.52 ± 0.03 → (0.49, 0.55) or 49% to 55%
Interpretation: We are 95% confident that the true proportion of voters who support the measure is between 49% and 55%.
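The arithmetic in Example 2 can be sketched in a few lines of Python. This is only an illustration: the function names `interval_from_margin` and `margin_from_interval` are hypothetical helpers, not part of any library.

```python
def interval_from_margin(point_estimate, margin_of_error):
    """Return (lower, upper) for point estimate ± margin of error."""
    return (point_estimate - margin_of_error, point_estimate + margin_of_error)


def margin_from_interval(lower, upper):
    """Recover (point estimate, margin of error) from the two bounds."""
    return ((lower + upper) / 2, (upper - lower) / 2)


# Poll from Example 2: p-hat = 0.52, margin of error = 0.03
lower, upper = interval_from_margin(0.52, 0.03)
print(f"({lower:.2f}, {upper:.2f})")  # prints (0.49, 0.55)

# Going the other way: the midpoint is the point estimate,
# and half the width is the margin of error.
estimate, margin = margin_from_interval(lower, upper)
```

The second helper shows that the two forms carry the same information: given the bounds, the point estimate is the midpoint and the margin of error is half the interval's width.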
3. Confidence Level
The confidence level (usually 90%, 95%, or 99%) tells us how confident we are that our interval captures the true parameter.
If we were to take many random samples and construct a 95% confidence interval from each sample, about 95% of those intervals would contain the true population parameter.
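For any single interval, the parameter either is or is not inside it. The 95% describes how often the method succeeds across repeated samples, which is why we say we are "95% confident" in the interval we happened to get.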
Common Confidence Levels
| Confidence Level | Success Rate | When to Use |
|---|---|---|
| 90% | About 9 out of 10 intervals capture μ | Narrower interval wanted; lower confidence is acceptable |
| 95% | About 19 out of 20 intervals capture μ | Standard choice (most common) |
| 99% | About 99 out of 100 intervals capture μ | High confidence required; a wider interval is acceptable |
Example 3: Visualizing Confidence Level
Imagine taking 20 random samples from a population where μ = 100. For each sample, we construct a 95% confidence interval:
- Sample 1: (97.5, 102.3) Contains μ = 100
- Sample 2: (98.1, 103.7) Contains μ = 100
- Sample 3: (95.8, 100.4) Contains μ = 100
- ...
- Sample 19: (99.2, 104.1) Contains μ = 100
- Sample 20: (101.5, 106.2) Does NOT contain μ = 100
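Result: 19 out of 20 intervals (95%) captured the true mean. One interval (5%) did not. In any particular set of 20 intervals the count could just as well be 18 or 20; "95% confidence" describes the capture rate in the long run, over many repeated samples.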
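The long-run idea can be checked with a small simulation. The sketch below rests on assumptions that are not part of the lesson: the population is normal with μ = 100 and a known σ = 10, each sample has n = 50, and the interval is x̄ ± 1.96·σ/√n. The function name `coverage_rate` and all default values are hypothetical.

```python
import random
import statistics


def coverage_rate(mu=100.0, sigma=10.0, n=50, trials=2000, z=1.96, seed=0):
    """Fraction of simulated 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one random sample and compute its mean.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        # Margin of error = critical value * standard error (sigma assumed known).
        margin = z * sigma / n ** 0.5
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"Fraction of intervals containing mu: {rate:.3f}")
```

The printed fraction should land close to 0.95 but rarely exactly on it, which mirrors the point of Example 3: the confidence level describes the method over many samples, not any one interval.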
4. Margin of Error
Definition: Margin of Error
The margin of error (E) is the amount we add to and subtract from the point estimate to create the confidence interval. At the stated confidence level, it is the largest difference we expect between the point estimate and the true parameter.
Margin of Error (General Formula)
E = Critical Value × Standard Error
Critical value: Depends on confidence level (from z-table or t-table)
Standard error: Measures sampling variability (depends on σ and n)
What affects the margin of error?
- Confidence level: Higher confidence → larger margin of error (wider interval)
- Sample size (n): Larger sample → smaller margin of error (narrower interval)
- Population variability (σ): More variability → larger margin of error
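A short sketch can show how sample size and variability move the margin of error. It assumes a confidence interval for a mean with known σ, so the standard error is σ/√n and the critical value comes from the standard normal distribution. The function name `margin_of_error` and the numbers used are hypothetical.

```python
from statistics import NormalDist


def margin_of_error(sigma, n, confidence=0.95):
    """Critical value times standard error for a mean with known sigma."""
    # Two-sided critical value: leave (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / n ** 0.5


# Larger samples shrink the margin of error: quadrupling n halves it.
by_n = {n: margin_of_error(sigma=10, n=n) for n in (100, 400, 1600)}
for n, e in by_n.items():
    print(f"sigma = 10, n = {n:>4}: E = {e:.2f}")

# More variability inflates it: doubling sigma doubles E.
by_sigma = {s: margin_of_error(sigma=s, n=100) for s in (10, 20)}
for s, e in by_sigma.items():
    print(f"sigma = {s}, n =  100: E = {e:.2f}")
```

Because n sits under a square root, cutting the margin of error in half takes four times as many observations, not twice as many.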
Example 4: Effect of Confidence Level on Margin of Error
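Suppose we have x̄ = 50, s = 10, n = 100. The standard error is s/√n = 10/√100 = 1, so each margin of error equals the critical value (z ≈ 1.645, 1.96, and 2.576, rounded to two decimals below).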
- 90% CI: 50 ± 1.65 → (48.35, 51.65) — narrower interval
- 95% CI: 50 ± 1.96 → (48.04, 51.96) — moderate width
- 99% CI: 50 ± 2.58 → (47.42, 52.58) — wider interval
Trade-off: Higher confidence gives wider intervals. To be more confident we've captured the parameter, we need to cast a wider net.
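The three intervals in Example 4 can be reproduced roughly as follows. This sketch assumes z critical values are appropriate (a large sample, with s standing in for σ) and computes them with `statistics.NormalDist` rather than reading them from a table, so the 90% margin may differ from the table-rounded 1.65 in the second decimal place. The helper names are hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value from the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(confidence, s, n):
    """Critical value times the standard error s / sqrt(n)."""
    return z_critical(confidence) * s / n ** 0.5


x_bar, s, n = 50, 10, 100  # values from Example 4
for confidence in (0.90, 0.95, 0.99):
    e = margin_of_error(confidence, s, n)
    print(f"{confidence:.0%} CI: {x_bar} ± {e:.2f} -> ({x_bar - e:.2f}, {x_bar + e:.2f})")
```

Only the critical value changes from line to line; the standard error stays at 1, so the widening of the interval comes entirely from asking for more confidence.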
5. Interpreting Confidence Intervals
Correct Interpretations
If we have a 95% CI of (48, 52) for μ:
- "We are 95% confident that the true population mean is between 48 and 52."
- "This interval was constructed using a method that captures the true mean 95% of the time."
- "If we repeated this process many times, 95% of intervals would contain μ."
Incorrect Interpretations
Common mistakes to AVOID:
- "There is a 95% chance that μ is between 48 and 52."
Why wrong: The parameter μ is fixed (not random). It either is or isn't in the interval. The randomness is in the sampling process, not the parameter.
- "95% of the data falls between 48 and 52."
Why wrong: The CI is about the population parameter (μ), not individual data points.
- "The sample mean has a 95% chance of being between 48 and 52."
Why wrong: We already know the sample mean! The CI is about the unknown population parameter.
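The "95% of the data" misconception is easy to test with simulated data. The sketch below uses made-up values (100 draws from a normal population with mean 50 and standard deviation 10) and a z-based 95% interval for the mean; none of these numbers come from the lesson.

```python
import random
import statistics

rng = random.Random(1)
data = [rng.gauss(50, 10) for _ in range(100)]  # hypothetical sample

# 95% confidence interval for the mean (z-based, using the sample s).
x_bar = statistics.fmean(data)
s = statistics.stdev(data)
margin = 1.96 * s / len(data) ** 0.5
lower, upper = x_bar - margin, x_bar + margin

# Share of individual observations that fall inside the interval.
inside = sum(lower <= x <= upper for x in data) / len(data)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
print(f"Share of data points inside the CI: {inside:.0%}")
```

The interval is only a few units wide while the individual values spread over tens of units, so far fewer than 95% of the data points fall inside it. The interval describes where μ plausibly lies, not where the observations lie.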
Example 5: Practicing Interpretations
A researcher reports: "The 95% confidence interval for the average study time is (12.5, 15.3) hours per week."
Good interpretation:
"We are 95% confident that the true average study time for all students in the population is between 12.5 and 15.3 hours per week."
Bad interpretation:
"There is a 95% probability that the average study time is between 12.5 and 15.3 hours."
Check Your Understanding
Question 1: What is the main advantage of using an interval estimate instead of a point estimate?
Answer: An interval estimate provides a range of plausible values along with a measure of confidence, acknowledging sampling variability. A point estimate is just one number with no indication of how precise or reliable it is.
Question 2: A 90% confidence interval for μ is (45, 55). Does this mean there's a 90% chance that μ is between 45 and 55?
Answer: No! The parameter μ is fixed—it either is or isn't in the interval. The 90% refers to our confidence in the method: if we constructed many such intervals, 90% would contain μ.
Question 3: Which confidence level produces the widest interval: 90%, 95%, or 99%?
Answer: 99% confidence produces the widest interval. To be more confident we've captured the parameter, we need a wider net (larger margin of error).
Question 4: A poll reports "45% support the policy (margin of error ± 4%)". What is the confidence interval?
Answer: The confidence interval is 45% ± 4% = (41%, 49%). We're confident (usually 95% unless stated otherwise) that the true population proportion is between 41% and 49%.
Question 5: If you want a narrower (more precise) confidence interval, should you increase or decrease the sample size?
Answer: Increase the sample size. Larger samples reduce the standard error, which reduces the margin of error, making the interval narrower. This is why polls with larger samples have smaller margins of error.
Lesson Summary
- A confidence interval provides a range of plausible values for a parameter
- Structure: Point Estimate ± Margin of Error
- The confidence level (90%, 95%, 99%) indicates how often our method captures the true parameter
- Margin of error depends on confidence level, sample size, and population variability
- Interpret CIs in terms of confidence in the method, not probability about the parameter
- Higher confidence → wider intervals; larger samples → narrower intervals