Determining Sample Size
Learn how to calculate the sample size needed to achieve a desired margin of error
Lesson Objectives
By the end of this lesson, you will be able to:
- Explain why sample size matters for precision
- Calculate required sample size for estimating means
- Calculate required sample size for estimating proportions
- Use conservative estimates when no prior information is available
- Understand trade-offs between precision and cost
1. Why Sample Size Matters
The margin of error in a confidence interval depends directly on sample size. Larger samples give more precise estimates (narrower intervals), but they also cost more in time and money.
"How large a sample do I need to estimate a parameter within a desired margin of error at a given confidence level?"
Example 1: Why Sample Size Planning Matters
Scenario: A company wants to estimate average customer satisfaction within ±0.2 points on a 10-point scale with 95% confidence.
- If they sample n = 50: margin of error might be ±0.4 (too imprecise)
- If they sample n = 200: margin of error might be ±0.2 (meets the target)
- If they sample n = 1000: margin of error might be about ±0.09 (more precise than needed, wasting resources)
Goal: Calculate the smallest sample size that meets the requirement (here, n = 200).
2. Sample Size for Estimating Means
To estimate a population mean μ within a desired margin of error E with a certain confidence level, we rearrange the margin of error formula and solve for n.
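For reference, the rearrangement can be written out step by step. This is a sketch that assumes the usual z-interval margin of error for a mean, E = z*σ/√n, with σ known or estimated:

```latex
\begin{align*}
E &= z^{*}\,\frac{\sigma}{\sqrt{n}} \\
\sqrt{n} &= \frac{z^{*}\sigma}{E} \\
n &= \left(\frac{z^{*}\sigma}{E}\right)^{2}
\end{align*}
```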
Sample Size Formula for Means
n = (z*σ / E)²
Where:
z* = critical value for desired confidence level
σ = population standard deviation (or estimate from pilot study)
E = desired margin of error
Note: Always round UP to the next whole number (can't sample 0.3 of a person!)
Steps to Calculate Sample Size for Means:
- Determine desired confidence level → find z*
- Determine desired margin of error (E)
- Estimate population standard deviation (σ):
  - Use a value from prior research or a pilot study
  - Use range/4 as a rough estimate if no other information is available
- Calculate n = (z*σ / E)²
- Round UP to next whole number
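The steps above can be sketched in Python. This is a minimal illustration rather than a library routine: the function name `sample_size_mean`, its parameters, and the example numbers (scores spanning 0 to 100, E = 5) are assumptions introduced here. z* is passed in as a table value, and the range/4 rule is applied only when no σ is supplied.

```python
import math

def sample_size_mean(z_star, margin, sigma=None, value_range=None):
    """Round (z* * sigma / E)^2 up to a whole number of observations."""
    if sigma is None:
        if value_range is None:
            raise ValueError("provide sigma or value_range")
        # Rough fallback from the steps above: sigma is about range / 4.
        sigma = value_range / 4
    raw_n = (z_star * sigma / margin) ** 2
    # Trim floating-point noise, then round UP to a whole number.
    return math.ceil(round(raw_n, 8))

# Hypothetical case: scores span 0 to 100, target E = 5 at 95% confidence.
n = sample_size_mean(z_star=1.96, margin=5, value_range=100)
print(n)  # 97
```

Here σ is taken as 100/4 = 25, so n = (1.96 × 25 / 5)² = 96.04, which rounds up to 97.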
Example 2: Sample Size for Estimating Mean GPA
Problem: A university wants to estimate the mean GPA of all students within ±0.05 points with 95% confidence. A previous study found σ ≈ 0.40. How many students should be sampled?
Solution:
Given:
- Desired margin of error: E = 0.05
- Confidence level: 95% → z* = 1.96
- Estimated σ = 0.40
Calculate:
n = (z*σ / E)²
n = (1.96 × 0.40 / 0.05)²
n = (0.784 / 0.05)²
n = (15.68)²
n = 245.86
Round up: n = 246 students
Answer: The university should sample 246 students to estimate mean GPA within ±0.05 with 95% confidence.
Example 3: Effect of Confidence Level on Sample Size
Problem: Compare sample sizes needed to estimate mean income within $500 if σ = $3000 at different confidence levels.
a) 90% confidence (z* = 1.645):
n = (1.645 × 3000 / 500)² = (9.87)² = 97.4 → 98 people
b) 95% confidence (z* = 1.96):
n = (1.96 × 3000 / 500)² = (11.76)² = 138.3 → 139 people
c) 99% confidence (z* = 2.576):
n = (2.576 × 3000 / 500)² = (15.456)² = 238.9 → 239 people
Observation: Higher confidence requires larger samples. Going from 90% to 99% more than doubles the required sample size!
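The comparison above can be reproduced with a short script. This is a sketch under stated assumptions: the helper names `critical_value` and `sample_size_mean` are hypothetical, z* is derived from the standard normal distribution via Python's `statistics.NormalDist`, and it is rounded to three decimals to match the table values used in this lesson.

```python
import math
from statistics import NormalDist

def critical_value(confidence):
    """Two-sided z* for a confidence level such as 0.95."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

def sample_size_mean(z_star, sigma, margin):
    """Round (z* * sigma / E)^2 up to a whole number."""
    raw_n = (z_star * sigma / margin) ** 2
    return math.ceil(round(raw_n, 8))

sizes = {}
for confidence in (0.90, 0.95, 0.99):
    z_star = round(critical_value(confidence), 3)  # match table precision
    sizes[confidence] = sample_size_mean(z_star, sigma=3000, margin=500)
    print(f"{confidence:.0%}: z* = {z_star}, n = {sizes[confidence]}")
```

The loop yields n = 98, 139, and 239 for 90%, 95%, and 99% confidence.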
3. Sample Size for Estimating Proportions
To estimate a population proportion p within a desired margin of error E, we use a similar approach.
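The same rearrangement works for proportions. This sketch assumes the usual large-sample margin of error for a proportion, E = z*√(p̂(1-p̂)/n):

```latex
\begin{align*}
E &= z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \\
E^{2} &= (z^{*})^{2}\,\frac{\hat{p}(1-\hat{p})}{n} \\
n &= \hat{p}(1-\hat{p})\left(\frac{z^{*}}{E}\right)^{2}
\end{align*}
```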
Sample Size Formula for Proportions
n = p̂(1-p̂) × (z*/E)²
Where:
p̂ = estimated proportion (from pilot study or prior research)
z* = critical value for desired confidence level
E = desired margin of error (as a proportion, not percentage)
When You Don't Have a Prior Estimate
When you have no prior information about p, use p̂ = 0.5 (50%). This gives the maximum possible sample size because p̂(1-p̂) is maximized at p̂ = 0.5.
This ensures your sample will be large enough regardless of the true value of p.
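A quick numerical check makes the claim concrete. The sketch below evaluates p̂(1-p̂) on an illustrative grid of candidate proportions (the grid and variable names are assumptions, not part of the lesson) and reports where the product peaks.

```python
# Evaluate p(1 - p) on a grid of candidate proportions 0.1, 0.2, ..., 0.9.
grid = [i / 10 for i in range(1, 10)]
products = {p: p * (1 - p) for p in grid}

# The proportion with the largest product requires the largest sample.
worst_case_p = max(products, key=products.get)
print(worst_case_p, products[worst_case_p])  # 0.5 0.25
```

Because every other value of p̂ gives a product below 0.25, a sample sized for p̂ = 0.5 is at least as large as the sample needed for any other proportion.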
Example 4: Sample Size for Proportion with Prior Estimate
Problem: A marketing team wants to estimate the proportion of customers who would buy a new product within ±3% with 95% confidence. A pilot study suggests p ≈ 0.35. How many customers should they survey?
Solution:
Given:
- E = 0.03 (3% as a proportion)
- Confidence: 95% → z* = 1.96
- p̂ = 0.35 (from pilot study)
Calculate:
n = p̂(1-p̂) × (z*/E)²
n = 0.35(0.65) × (1.96/0.03)²
n = 0.2275 × (65.333)²
n = 0.2275 × 4268.44
n = 971.07
Round up: n = 972 customers
Answer: They should survey 972 customers.
Example 5: Sample Size WITHOUT Prior Estimate
Problem: A political campaign wants to estimate voter support within ±2% with 95% confidence, but has no prior data. How many voters should they poll?
Solution (Conservative Approach):
Since there is no prior estimate, use p̂ = 0.5
E = 0.02, z* = 1.96
n = 0.5(0.5) × (1.96/0.02)²
n = 0.25 × (98)²
n = 0.25 × 9604
n = 2401
Answer: They should poll 2,401 voters.
Note: This is why political polls with ±2-3% margins typically have sample sizes of 1000-2500 people!
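The proportion formula can be sketched as a small Python function. The name `sample_size_proportion` and its parameters are hypothetical, and the default p̂ = 0.5 encodes the conservative approach described above. The light rounding before `math.ceil` is a design choice to keep floating-point noise from pushing an exact result such as 2401 up to 2402.

```python
import math

def sample_size_proportion(z_star, margin, p_hat=0.5):
    """Round p_hat(1 - p_hat) * (z*/E)^2 up to a whole number.

    p_hat defaults to 0.5, the conservative choice when no prior
    estimate is available.
    """
    if not 0 < p_hat < 1:
        raise ValueError("p_hat must be strictly between 0 and 1")
    raw_n = p_hat * (1 - p_hat) * (z_star / margin) ** 2
    # Trim floating-point noise so an exact 2401 is not bumped to 2402.
    return math.ceil(round(raw_n, 8))

# The polling problem above: E = 0.02, z* = 1.96, no prior estimate.
print(sample_size_proportion(z_star=1.96, margin=0.02))  # 2401
```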
4. Trade-offs: Precision vs. Cost
| Factor | Effect on Sample Size | Cost Implication |
|---|---|---|
| Smaller margin of error | Larger n needed | Higher cost |
| Higher confidence level | Larger n needed | Higher cost |
| More variable population (larger σ) | Larger n needed | Higher cost |
- Cutting margin of error in half requires 4 times the sample size
- Researchers must balance precision (small E) with budget constraints
- Common practice: Use 95% confidence with E chosen based on budget
- Pilot studies help estimate σ or p to avoid oversampling
Example 6: Comparing Costs
A survey costs $5 per respondent. Compare total costs for different margins of error, assuming 95% confidence and the conservative p̂ = 0.5:
| Margin of Error | Sample Size (n) | Total Cost |
|---|---|---|
| ±4% | 601 | $3,005 |
| ±3% | 1,068 | $5,340 |
| ±2% | 2,401 | $12,005 |
| ±1% | 9,604 | $48,020 |
Observation: Going from a ±4% to a ±1% margin increases cost by about 16 times! Researchers must decide if the extra precision is worth the cost.
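A cost comparison like this can be generated programmatically. The sketch below assumes 95% confidence (z* = 1.96), the conservative p̂ = 0.5, and the $5 per-respondent cost from the example; the constant and function names are hypothetical. Each n is rounded up before the cost is computed.

```python
import math

COST_PER_RESPONDENT = 5  # dollars, from the example above
Z_STAR = 1.96            # assumed: 95% confidence
P_HAT = 0.5              # assumed: conservative, no prior estimate

def sample_size_proportion(z_star, margin, p_hat):
    """Round p_hat(1 - p_hat) * (z*/E)^2 up to a whole number."""
    raw_n = p_hat * (1 - p_hat) * (z_star / margin) ** 2
    return math.ceil(round(raw_n, 8))

rows = []
for margin in (0.04, 0.03, 0.02, 0.01):
    n = sample_size_proportion(Z_STAR, margin, P_HAT)
    cost = n * COST_PER_RESPONDENT
    rows.append((margin, n, cost))
    print(f"±{margin:.0%}: n = {n:,}, cost = ${cost:,}")
```

Because each n is rounded up, the ±4% and ±3% sample sizes come out as 601 and 1,068, costing $3,005 and $5,340.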
Check Your Understanding
Question 1: If you want to cut the margin of error in half, by what factor must you increase the sample size?
Answer: Multiply by 4. Since n appears under a square root in the margin of error formula, cutting E in half requires multiplying n by 2² = 4.
Question 2: When estimating a proportion with no prior information, what value should you use for p̂?
Answer: Use p̂ = 0.5. This gives the maximum possible sample size and guarantees your sample will be large enough regardless of the true proportion.
Question 3: You need n = 384.16. How many people should you actually sample?
Answer: 385 people. Always round UP to the next whole number to ensure you meet the desired margin of error.
Question 4: For E = 0.04 and z* = 1.96, what sample size is needed for a proportion (using conservative approach)?
Answer: n = 0.25 × (1.96/0.04)² = 0.25 × (49)² = 0.25 × 2401 = 600.25 → 601 people
Question 5: Why does higher confidence require a larger sample size?
Answer: Higher confidence means larger z*, which increases the numerator in the sample size formula. To be more confident we've captured the parameter, we need to collect more data.
Lesson Summary
- Sample size for means: n = (z*σ / E)²
- Sample size for proportions: n = p̂(1-p̂) × (z*/E)²
- Use p̂ = 0.5 for proportions when no prior estimate available (conservative)
- Always round UP to next whole number
- Smaller margin of error → larger sample needed (quadratic relationship)
- Higher confidence → larger sample needed
- Must balance precision with cost in real-world applications