Confidence Intervals for Proportions
Learn to construct and interpret confidence intervals for population proportions
Lesson Objectives
By the end of this lesson, you will be able to:
- Construct confidence intervals for population proportions
- Check the success-failure condition before using normal approximation
- Find critical z-values for proportion CIs
- Calculate margin of error for proportions
- Interpret confidence intervals for proportions correctly
1. The Confidence Interval Formula for Proportions
When estimating a population proportion p (like the percentage of voters who support a candidate), we use the sample proportion p̂ and construct an interval around it.
Confidence Interval for a Population Proportion (p)
p̂ ± z* × √(p̂(1-p̂)/n)
Where:
p̂ = sample proportion (point estimate)
z* = critical value from standard normal distribution
n = sample size
√(p̂(1-p̂)/n) = standard error for proportions
For proportions, we use the z-distribution (not t) because:
- The standard error formula √(p̂(1-p̂)/n) doesn't involve estimating σ separately
- By the Central Limit Theorem, p̂ is approximately normally distributed for large samples
- The conditions np̂ ≥ 10 and n(1-p̂) ≥ 10 ensure the normal approximation is valid
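As a minimal sketch, the interval p̂ ± z* × √(p̂(1-p̂)/n) can be written as a small Python function. The function name `proportion_ci` and the input values are illustrative assumptions, not part of the lesson; the function assumes the conditions listed above have already been checked.

```python
import math


def proportion_ci(p_hat: float, n: int, z_star: float) -> tuple[float, float]:
    """Return (lower, upper) for p_hat ± z* × sqrt(p_hat(1 - p_hat)/n)."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin


# Illustrative inputs (assumed): p_hat = 0.40, n = 100, z* = 1.96 for 95%.
lower, upper = proportion_ci(0.40, 100, 1.96)
print(f"({lower:.3f}, {upper:.3f})")  # (0.304, 0.496)
```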
2. Conditions for Using Normal Approximation
Success-Failure Condition
Before constructing a CI for a proportion, check that:
np̂ ≥ 10 and n(1-p̂) ≥ 10
This ensures we have enough successes and failures for the normal approximation to be valid.
Example 1: Checking Conditions
a) n = 200, p̂ = 0.45 (90 successes)
np̂ = 200(0.45) = 90 ≥ 10
n(1-p̂) = 200(0.55) = 110 ≥ 10
Conclusion: Conditions met. Safe to use normal approximation.
b) n = 50, p̂ = 0.10 (5 successes)
np̂ = 50(0.10) = 5 < 10
n(1-p̂) = 50(0.90) = 45 ≥ 10
Conclusion: Condition NOT met. Too few successes. Cannot use this method.
c) n = 400, p̂ = 0.62 (248 successes)
np̂ = 400(0.62) = 248 ≥ 10
n(1-p̂) = 400(0.38) = 152 ≥ 10
Conclusion: Conditions met. Safe to proceed.
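The three checks in Example 1 could be automated with a short helper. This is a sketch only: the name `success_failure_ok` is hypothetical, and it works from counts (np̂ is the number of successes, n(1-p̂) the number of failures) to avoid floating-point rounding.

```python
def success_failure_ok(successes: int, n: int, threshold: int = 10) -> bool:
    """Check np_hat >= threshold and n(1 - p_hat) >= threshold using raw counts."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


# The three cases from Example 1: (successes, sample size)
for label, successes, n in [("a", 90, 200), ("b", 5, 50), ("c", 248, 400)]:
    verdict = "conditions met" if success_failure_ok(successes, n) else "condition NOT met"
    print(f"{label}) successes={successes}, failures={n - successes}: {verdict}")
```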
3. Critical z-Values for Common Confidence Levels
Since we use the standard normal (z) distribution for proportions, here are the critical values for common confidence levels:
| Confidence Level | Critical Value (z*) |
|---|---|
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
For a 95% CI, z* = 1.96 is the most commonly used value. It is often rounded to 2 for quick calculations: by the 68-95-99.7 rule, about 95% of a normal distribution lies within ±2 standard deviations of its mean, which here means ±2 standard errors.
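The table values can be reproduced rather than memorized. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `critical_z` is an assumption for illustration. For a two-sided interval at confidence level C, z* is the (1 + C)/2 quantile of the standard normal distribution.

```python
from statistics import NormalDist


def critical_z(confidence_level: float) -> float:
    """Two-sided critical value: the (1 + C)/2 quantile of the standard normal."""
    return NormalDist().inv_cdf((1 + confidence_level) / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {critical_z(level):.3f}")
# 90%: z* = 1.645
# 95%: z* = 1.960
# 99%: z* = 2.576
```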
4. Constructing Confidence Intervals for Proportions
Steps to Construct a CI for p:
1. Check conditions:
   - Random sample
   - np̂ ≥ 10 and n(1-p̂) ≥ 10 (success-failure condition)
   - Population at least 10 times larger than sample (independence)
2. Calculate p̂: p̂ = x/n (where x = number of successes)
3. Find critical value: Look up z* for desired confidence level
4. Calculate standard error: SE = √(p̂(1-p̂)/n)
5. Calculate margin of error: E = z* × SE
6. Construct interval: p̂ ± E
7. Interpret: State conclusion in context
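The procedure above can be sketched end to end in Python. Everything named here (`proportion_confidence_interval`, the optional `population_size` argument, and the sample counts in the usage lines) is a hypothetical choice for illustration. Randomness of the sample cannot be verified in code, so it is assumed, and the interpretation step is left to the reader.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence_level=0.95,
                                   population_size=None):
    """One-proportion z-interval; assumes the sample was drawn at random."""
    failures = n - successes

    # Check conditions: success-failure (np_hat = successes, n(1 - p_hat) = failures).
    if successes < 10 or failures < 10:
        raise ValueError("success-failure condition not met")
    # Check conditions: independence, only testable when a population size is given.
    if population_size is not None and population_size < 10 * n:
        raise ValueError("population is less than 10 times the sample size")

    p_hat = successes / n                                        # calculate p_hat
    z_star = NormalDist().inv_cdf((1 + confidence_level) / 2)    # critical value
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)          # standard error
    margin = z_star * standard_error                             # margin of error
    return p_hat - margin, p_hat + margin                        # construct interval


# Hypothetical data: 120 successes in a sample of 300, 90% confidence.
lower, upper = proportion_confidence_interval(120, 300, confidence_level=0.90)
print(f"90% CI: ({lower:.3f}, {upper:.3f})")
```

Raising an error when a condition fails is one possible design; returning a warning flag alongside the interval would be another.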
Example 2: Complete CI Calculation for a Proportion
Problem: In a survey of 500 randomly selected voters, 260 said they support a new policy. Construct a 95% confidence interval for the proportion of all voters who support the policy.
Solution:
Step 1: Check conditions
- Random sample (stated)
- p̂ = 260/500 = 0.52
- np̂ = 500(0.52) = 260 ≥ 10
- n(1-p̂) = 500(0.48) = 240 ≥ 10
- Independence: assuming the population of all voters is larger than 10 × 500 = 5,000
Step 2: Sample proportion
p̂ = 260/500 = 0.52
Step 3: Critical value
For 95% CI: z* = 1.96
Step 4: Standard error
SE = √(p̂(1-p̂)/n) = √(0.52 × 0.48 / 500) = √(0.0004992) = 0.02234
Step 5: Margin of error
E = z* × SE = 1.96 × 0.02234 = 0.0438 ≈ 0.044
Step 6: Construct interval
0.52 ± 0.044 = (0.476, 0.564) or (47.6%, 56.4%)
Step 7: Interpret
Conclusion: We are 95% confident that the true proportion of all voters who support the policy is between 47.6% and 56.4%.
Example 3: Higher Confidence Level
Problem: A quality control inspector finds 18 defective items in a random sample of 200 products. Construct a 99% confidence interval for the proportion of defective products.
Solution:
p̂ = 18/200 = 0.09
Check: np̂ = 200(0.09) = 18 ≥ 10, n(1-p̂) = 200(0.91) = 182 ≥ 10
For 99% CI: z* = 2.576
SE = √(0.09 × 0.91 / 200) = √(0.0004095) = 0.02024
E = 2.576 × 0.02024 = 0.0521
CI: 0.09 ± 0.052 = (0.038, 0.142) or (3.8%, 14.2%)
Interpretation: We are 99% confident that the true proportion of defective products is between 3.8% and 14.2%.
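To see what the higher confidence level costs in precision, the sketch below recomputes the interval for the same defect data (18 of 200) at all three levels from the critical-value table. The variable names are illustrative; the standard error is the same in every case, so only z* changes the width.

```python
import math

p_hat, n = 18 / 200, 200
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)

# z* values taken from the critical-value table in this lesson.
z_table = {"90%": 1.645, "95%": 1.960, "99%": 2.576}

intervals = {}
for level, z_star in z_table.items():
    margin = z_star * standard_error
    intervals[level] = (p_hat - margin, p_hat + margin)
    lower, upper = intervals[level]
    print(f"{level} CI: ({lower:.3f}, {upper:.3f})")
# 90% CI: (0.057, 0.123)
# 95% CI: (0.050, 0.130)
# 99% CI: (0.038, 0.142)
```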
5. Interpreting Proportion CIs
Example 4: Common Scenarios
Political Poll:
"52% ± 3% (95% CI)" means we're 95% confident the true support is between 49% and 55%.
Note: If the CI includes 50%, we cannot confidently say the candidate has majority support (could be below 50%).
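The majority-support reasoning amounts to comparing both interval endpoints with 50%. A small sketch, with the hypothetical helper name `classify_support` and labels chosen for illustration:

```python
def classify_support(lower: float, upper: float, cutoff: float = 0.50) -> str:
    """Describe what an interval (lower, upper) says relative to the cutoff."""
    if lower > cutoff:
        return "majority"       # whole interval above the cutoff
    if upper < cutoff:
        return "minority"       # whole interval below the cutoff
    return "inconclusive"       # interval contains the cutoff


# The poll above: 52% ± 3% gives the interval (0.49, 0.55).
print(classify_support(0.49, 0.55))  # inconclusive
```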
Medical Test:
95% CI for disease prevalence: (0.12, 0.18)
Interpretation: We are 95% confident that between 12% and 18% of the population has the disease.
Customer Satisfaction:
90% CI for customer satisfaction: (0.73, 0.81)
Interpretation: We're 90% confident that 73% to 81% of customers are satisfied.
Check Your Understanding
Question 1: Why do we use z instead of t for confidence intervals for proportions?
Answer: For proportions, the standard error formula √(p̂(1-p̂)/n) is completely determined by p̂ and n, so we don't need to estimate a separate population standard deviation. The sampling distribution of p̂ is approximately normal (when conditions are met), so we use z.
Question 2: A sample has n = 80, p̂ = 0.08. Does this satisfy the success-failure condition?
Answer: No. np̂ = 80(0.08) = 6.4 < 10. We don't have enough successes for the normal approximation to be reliable.
Question 3: For a 95% CI with p̂ = 0.40 and n = 100, what is the standard error?
Answer: SE = √(p̂(1-p̂)/n) = √(0.40 × 0.60 / 100) = √(0.0024) = 0.049
Question 4: A 95% CI for p is (0.32, 0.48). What was the sample proportion p̂?
Answer: p̂ is the center of the interval: (0.32 + 0.48) / 2 = 0.40. The margin of error is 0.08.
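Because the interval is symmetric around p̂, both the point estimate and the margin of error can be recovered from the endpoints alone. A brief sketch (the function name `center_and_margin` is an assumption):

```python
def center_and_margin(lower: float, upper: float) -> tuple[float, float]:
    """Recover the sample proportion and margin of error from a symmetric CI."""
    p_hat = (lower + upper) / 2
    margin = (upper - lower) / 2
    return p_hat, margin


p_hat, margin = center_and_margin(0.32, 0.48)
print(f"p_hat = {p_hat:.2f}, margin of error = {margin:.2f}")
# p_hat = 0.40, margin of error = 0.08
```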
Question 5: Does a wider confidence interval mean we're more or less confident in our estimate?
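Answer: More confident, as long as the extra width comes from a higher confidence level on the same data (99% vs 95%, for example). A wider interval casts a wider net, making it more likely to capture the true parameter. However, it's less precise (less informative about the exact value). An interval that is wide only because the sample is small carries no extra confidence.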
Lesson Summary
- CI for p: p̂ ± z* × √(p̂(1-p̂)/n)
- Success-failure condition: np̂ ≥ 10 and n(1-p̂) ≥ 10
- Use z-distribution (not t) for proportions
- Common z* values: 1.645 (90%), 1.96 (95%), 2.576 (99%)
- Standard error for proportions: SE = √(p̂(1-p̂)/n)
- Larger samples → smaller SE → narrower intervals
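The last point can be checked numerically. In the sketch below, p̂ = 0.5 and the sample sizes are illustrative assumptions; because n sits under a square root, quadrupling the sample size halves the margin of error.

```python
import math


def margin_of_error(p_hat: float, n: int, z_star: float = 1.96) -> float:
    """Margin of error E = z* × sqrt(p_hat(1 - p_hat)/n)."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)


for n in (100, 400, 1600):
    print(f"n = {n:>4}: E = {margin_of_error(0.5, n):.4f}")
# n =  100: E = 0.0980
# n =  400: E = 0.0490
# n = 1600: E = 0.0245
```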