1. Parameters vs. Statistics
Parameter: A numerical summary of a population. Fixed but usually unknown.
Statistic: A numerical summary of a sample. Calculated from data, varies from sample to sample.
Notation
| Measure | Population Parameter | Sample Statistic |
| Mean | μ (mu) | x̄ (x-bar) |
| Standard Deviation | σ (sigma) | s |
| Proportion | p | p̂ (p-hat) |
| Variance | σ² | s² |
Remember: Greek letters (μ, σ) = POPULATION parameters
Roman letters (x̄, s, p̂) = SAMPLE statistics
Exception: the population proportion p is a Roman letter; the hat on p̂ marks the sample statistic.
2. Sampling Distributions
Sampling Distribution: The probability distribution of a sample statistic (like x̄ or p̂) based on all possible samples of size n from the population.
Important Distinction
- Population distribution: Distribution of individual values in the population
- Sample distribution: Distribution of individual values in ONE sample
- Sampling distribution: Distribution of a statistic across ALL possible samples
Sampling Variability: Different random samples produce different statistics. This is normal and expected!
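To make sampling variability concrete, here is a minimal simulation sketch. The population (10,000 values from a normal model with mean 100 and SD 20), the sample size of 25, and the seed are all illustrative assumptions, not values from this module.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values from a normal model (mean 100, SD 20).
population = [random.gauss(100, 20) for _ in range(10_000)]
mu = statistics.fmean(population)  # the parameter: fixed for this population

# Five random samples of size 25; each one yields its own sample mean.
sample_means = [
    statistics.fmean(random.sample(population, 25)) for _ in range(5)
]

print(f"population mean (parameter): {mu:.2f}")
for i, xbar in enumerate(sample_means, start=1):
    print(f"sample {i} mean (statistic): {xbar:.2f}")
```

The parameter is printed once and never changes, while each sample mean lands somewhere different around it.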
3. The Central Limit Theorem (CLT)
Central Limit Theorem: For a population with mean μ and standard deviation σ, the sampling distribution of x̄ from samples of size n will be approximately normal with:
- Mean: μₓ̄ = μ
- Standard deviation (SE): σₓ̄ = σ/√n
This approximation improves as n increases. A common rule of thumb is n ≥ 30, though strongly skewed populations may need larger samples.
CLT Conditions
- Random sampling: Samples randomly selected from population
- Independence: Individual observations independent (when sampling without replacement, use the 10% condition: the population should be at least 10n, i.e. n ≤ 10% of the population)
- Sample size: Either n ≥ 30 OR population is already normal
Why CLT is Amazing: Regardless of population shape (skewed, bimodal, uniform), the distribution of sample means will be approximately normal if n is large enough!
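The CLT claims can be checked by simulation. The sketch below assumes a strongly right-skewed population (exponential with rate 1, so μ = 1 and σ = 1); the sample size n = 40, the number of simulated samples, and the seed are arbitrary illustrative choices.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Assumed population model: exponential with rate 1 (right-skewed),
# which has mu = 1 and sigma = 1.
MU, SIGMA = 1.0, 1.0
n = 40               # size of each sample
num_samples = 5_000  # number of simulated samples


def sample_mean(size):
    """Return the mean of one random sample drawn from the population."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(size)])


means = [sample_mean(n) for _ in range(num_samples)]

center = statistics.fmean(means)   # should be close to mu
spread = statistics.stdev(means)   # should be close to sigma / sqrt(n)
theory_se = SIGMA / math.sqrt(n)

# For a normal curve, about 68% of values fall within 1 SD of the center.
within_one_se = sum(abs(m - MU) <= theory_se for m in means) / num_samples

print(f"mean of sample means: {center:.3f} (theory: {MU})")
print(f"SD of sample means:   {spread:.3f} (theory: {theory_se:.3f})")
print(f"share within 1 SE:    {within_one_se:.3f} (normal curve: about 0.68)")
```

Even though individual values are far from normal, the simulated sample means should center near μ, spread out by roughly σ/√n, and follow the 68% pattern of a normal curve.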
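Properties of Sampling Distribution of x̄
- Center: μₓ̄ = μ
- Spread: σₓ̄ = σ/√n
- Shape: approximately normal when n ≥ 30 or the population is normal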
4. Standard Error
Standard Error (SE): The standard deviation of a sampling distribution. Measures the typical distance between a sample statistic and the population parameter.
Standard Error vs. Standard Deviation
| Aspect | Standard Deviation (σ) | Standard Error (SE) |
| What it measures | Variability of individuals | Variability of sample means |
| Affected by n? | No (population SD is fixed) | Yes (SE = σ/√n) |
| Interpretation | How spread out the data is | Precision of estimates |
Sample Size Effect: To cut SE in half, you must multiply n by 4 (because of √n)
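The sample size effect is easy to verify numerically. This is a small sketch; the helper name `se_mean` and the value σ = 20 are illustrative assumptions.

```python
import math


def se_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if sigma < 0 or n <= 0:
        raise ValueError("sigma must be non-negative and n must be positive")
    return sigma / math.sqrt(n)


sigma = 20  # assumed population SD, for illustration only
for n in (25, 100, 400):
    # Each step multiplies n by 4, so each SE is half the previous one.
    print(f"n = {n:3d}  SE = {se_mean(sigma, n):.2f}")
```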
Standard Error Formulas
- For means: SE = σ/√n
- For proportions: SE = √(p(1-p)/n)
5. Sampling Distribution of Proportions
Sample Proportion: p̂ = x/n, where x = number of "successes" and n = sample size
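Properties of Sampling Distribution of p̂
- Center: μₚ̂ = p
- Spread: σₚ̂ = √(p(1-p)/n)
- Shape: approximately normal when the success-failure condition below holds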
Success-Failure Condition
For normal approximation to be appropriate:
np ≥ 10 AND n(1-p) ≥ 10
SE for proportions is largest when p = 0.5 (maximum variability)
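A short sketch of the proportion formulas follows. The helper names `se_proportion` and `success_failure_ok` are hypothetical, and n = 100 is an arbitrary illustrative sample size.

```python
import math


def se_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    if not 0 <= p <= 1 or n <= 0:
        raise ValueError("p must be in [0, 1] and n must be positive")
    return math.sqrt(p * (1 - p) / n)


def success_failure_ok(p, n):
    """True when np >= 10 and n(1-p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10


n = 100  # assumed sample size, for illustration only
grid = [i / 10 for i in range(1, 10)]  # p = 0.1, 0.2, ..., 0.9
for p in grid:
    print(f"p = {p:.1f}  SE = {se_proportion(p, n):.4f}  "
          f"normal approx OK: {success_failure_ok(p, n)}")

# The p on this grid with the largest SE.
p_with_max_se = max(grid, key=lambda p: se_proportion(p, n))
print("largest SE at p =", p_with_max_se)
```

The SE rises toward p = 0.5 and falls off symmetrically on either side, because p(1-p) peaks at 0.25 when p = 0.5.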
6. Step-by-Step Procedures
Finding Probability about Sample Mean
- Check CLT conditions
  - Random sample?
  - n ≥ 30 OR population normal?
- Find sampling distribution
  - μₓ̄ = μ
  - σₓ̄ = σ/√n
- Calculate z-score
  - z = (x̄ - μ) / (σ/√n)
- Find probability
  - Use z-table or calculator
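The procedure for a sample mean can be sketched as a small function. The name `prob_sample_mean_above`, the `population_normal` flag, and the numbers in the usage lines (μ = 50, σ = 12, n = 36) are illustrative assumptions; `statistics.NormalDist` from the Python standard library stands in for the z-table. Random sampling cannot be checked in code and is taken for granted here.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_above(threshold, mu, sigma, n, population_normal=False):
    """P(x-bar > threshold) using the normal sampling distribution of x-bar."""
    # Step 1: check the sample-size condition.
    if n < 30 and not population_normal:
        raise ValueError("need n >= 30 or a normal population")
    # Step 2: sampling distribution has mean mu and SE = sigma / sqrt(n).
    se = sigma / sqrt(n)
    # Step 3: z-score of the threshold.
    z = (threshold - mu) / se
    # Step 4: upper-tail probability from the standard normal CDF.
    return 1 - NormalDist().cdf(z)


# Illustrative values: SE = 12/6 = 2, so z = (53 - 50)/2 = 1.5.
p_above = prob_sample_mean_above(53, mu=50, sigma=12, n=36)
print(f"P(x-bar > 53) = {p_above:.4f}")
```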
Finding Probability about Sample Proportion
- Check success-failure condition
- Find sampling distribution
  - μₚ̂ = p
  - σₚ̂ = √(p(1-p)/n)
- Calculate z-score
  - z = (p̂ - p) / √(p(1-p)/n)
- Find probability
  - Use z-table or calculator
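The same steps for a sample proportion, again as a hedged sketch. The function name `prob_sample_prop_below` and the usage values (p = 0.30, n = 150) are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_prop_below(threshold, p, n):
    """P(p-hat < threshold) using the normal approximation."""
    # Step 1: success-failure condition.
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("need np >= 10 and n(1-p) >= 10")
    # Step 2: sampling distribution has mean p and SE = sqrt(p(1-p)/n).
    se = sqrt(p * (1 - p) / n)
    # Step 3: z-score of the threshold.
    z = (threshold - p) / se
    # Step 4: lower-tail probability from the standard normal CDF.
    return NormalDist().cdf(z)


# Illustrative values: np = 45 and n(1-p) = 105, so the condition holds.
p_below = prob_sample_prop_below(0.25, p=0.30, n=150)
print(f"P(p-hat < 0.25) = {p_below:.4f}")
```

Because the function keeps full precision instead of rounding z to two decimals, its result can differ from a z-table answer in the third or fourth decimal place.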
7. Example Problems
Example 1: CLT Application
Problem: A population has μ = 100, σ = 20. For n = 64, find P(x̄ > 105).
Solution:
- Check: n = 64 ≥ 30
- μₓ̄ = 100, σₓ̄ = 20/8 = 2.5
- z = (105-100)/2.5 = 2.0
- P(z > 2.0) = 1 - 0.9772 = 0.0228
Answer: 2.28% chance
Example 2: Sample Proportions
Problem: If p = 0.40 and n = 200, find P(p̂ < 0.35).
Solution:
- Check: np = 80 ≥ 10, n(1-p) = 120 ≥ 10
- SE = √(0.40 × 0.60 / 200) = √0.0012 ≈ 0.0346
- z = (0.35 - 0.40) / 0.0346 ≈ -1.44
- P(z < -1.44) ≈ 0.0749
Answer: About 7.5% chance
Example 3: Required Sample Size
Problem: To achieve SE ≤ 2 when σ = 20, what n is needed?
Solution:
- Want: σ/√n ≤ 2
- 20/√n ≤ 2
- √n ≥ 10
- n ≥ 100
Answer: Need at least n = 100
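The algebra above generalizes to n ≥ (σ/SE)², rounded up to a whole number. A minimal sketch, with `required_n` as a hypothetical helper name:

```python
import math


def required_n(sigma, target_se):
    """Smallest whole n with sigma / sqrt(n) <= target_se."""
    if sigma <= 0 or target_se <= 0:
        raise ValueError("sigma and target_se must be positive")
    # Solve sigma / sqrt(n) <= target_se for n, then round up.
    # Floating-point rounding could matter for borderline inputs.
    return math.ceil((sigma / target_se) ** 2)


print(required_n(20, 2))  # the problem above: sigma = 20, SE <= 2
print(required_n(20, 3))  # a non-round case, where rounding up matters
```

Rounding up (not to the nearest integer) matters: rounding down would leave the SE slightly above the target.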
8. Common Mistakes to Avoid
- Confusing σ and SE
  - σ = standard deviation of individuals
  - SE = standard deviation of sample means
  - Always use SE (not σ) when working with sample means
- Forgetting to check CLT conditions
  - Always verify n ≥ 30 or population normal
  - For proportions, check np ≥ 10 AND n(1-p) ≥ 10
- Using wrong formula for SE
  - For means: SE = σ/√n
  - For proportions: SE = √(p(1-p)/n)
  - Don't mix them up!
- Not squaring when solving for n
  - If √n = 10, then n = 100 (not 10!)
- Thinking larger samples change μₓ̄
  - μₓ̄ = μ regardless of sample size
  - Sample size affects SE, not the center
9. Quick Decision Guide
Which Distribution Should I Use?
| If you're finding probability about... | Use... |
| One individual value (x) | Population distribution with σ |
| Sample mean (x̄) | Sampling distribution with SE = σ/√n |
| Sample proportion (p̂) | Sampling distribution with SE = √(p(1-p)/n) |
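The difference between the first two rows shows up clearly in numbers. This sketch assumes a normal population with μ = 100 and σ = 20 and a sample of n = 64; these values are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 20, 64  # assumed normal population and sample size

# One individual value: use the population distribution (SD = sigma).
p_individual = 1 - NormalDist(mu, sigma).cdf(105)

# A sample mean: use the sampling distribution (SD = sigma / sqrt(n)).
p_sample_mean = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(105)

print(f"P(x > 105)     = {p_individual:.4f}")
print(f"P(x-bar > 105) = {p_sample_mean:.4f}")
```

A single value above 105 is common, while a mean of 64 values above 105 is rare, because averaging shrinks the spread by a factor of √n.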
When Can I Use Normal Distribution?
| For... | Conditions |
| Individual values | Population must be normal |
| Sample means | n ≥ 30 OR population normal |
| Sample proportions | np ≥ 10 AND n(1-p) ≥ 10 |
10. Formula Sheet
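All Key Formulas
- Sampling distribution of x̄: μₓ̄ = μ, σₓ̄ = σ/√n
- Sampling distribution of p̂: μₚ̂ = p, σₚ̂ = √(p(1-p)/n)
- z-score for a sample mean: z = (x̄ - μ) / (σ/√n)
- z-score for a sample proportion: z = (p̂ - p) / √(p(1-p)/n)
- Sample proportion: p̂ = x/n
- Sample size for a target SE: n ≥ (σ/SE)²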
11. Study Tips
- Understand the "why": Don't just memorize formulas—understand why CLT works
- Master notation: Keep parameters (Greek) vs. statistics (Roman) straight
- Always check conditions: Before applying CLT, verify n ≥ 30 or normal population
- Draw pictures: Sketch normal curves to visualize what you're finding
- Practice SE calculations: Get comfortable with σ/√n and √(p(1-p)/n)
- Compare SD vs. SE: Remember SE measures precision of estimates, not data spread
- Work many examples: Sampling distributions are abstract—practice builds intuition
Module 5: Sampling Distributions | Safaa Dabagh | Free Statistics Course