The Central Limit Theorem

One of the most powerful theorems in all of statistics

Lesson Objectives

By the end of this lesson, you will be able to:

State the Central Limit Theorem
Identify when the CLT applies (conditions)
Calculate the mean and standard error of the sampling distribution
Use the CLT to find probabilities about sample means

1. Statement of the Central Limit Theorem

The Central Limit Theorem (CLT)

For a population with mean μ and finite standard deviation σ, the sampling distribution of the sample mean x̄ from independent random samples of size n will be approximately normal when n is large, with:

Mean: μₓ̄ = μ
Standard deviation (Standard Error): σₓ̄ = σ/√n

This approximation improves as the sample size n increases, and is generally considered good for n ≥ 30.

Why is this amazing?

The CLT says that regardless of the population's shape (skewed, bimodal, uniform, etc.), the distribution of sample means will be approximately normal if the sample size is large enough!

This allows us to use normal distribution tools (z-scores, probabilities) even when the population isn't normal.
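A quick simulation can make this concrete. The sketch below is only an illustration under stated assumptions: the population is taken to be Exponential with rate 1 (strongly right-skewed, with μ = 1 and σ = 1), and the sample size of 40, the number of repetitions, and the function name are arbitrary choices, not part of the lesson.

```python
import math
import random
import statistics

def simulate_sample_means(n, num_samples=5000, seed=0):
    """Draw num_samples samples of size n from an Exponential(rate=1)
    population (an assumed, skewed population) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

n = 40
means = simulate_sample_means(n)
se = 1.0 / math.sqrt(n)  # sigma / sqrt(n), with sigma = 1 for this population

# Share of simulated sample means within 1.96 standard errors of mu = 1.
share_within = sum(abs(m - 1.0) <= 1.96 * se for m in means) / len(means)

print(statistics.fmean(means))  # close to mu = 1
print(statistics.stdev(means))  # close to sigma / sqrt(n), about 0.158
print(share_within)             # close to 0.95, as a normal curve predicts
```

Even though individual values from this population are far from bell-shaped, the simulated sample means behave much like a normal distribution centered at μ with spread σ/√n.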

2. Conditions for the Central Limit Theorem

The CLT approximation is reliable when all of the following hold:

CLT Conditions:

Random sampling: Samples are randomly selected from the population
Independence: Individual observations are independent
- For sampling without replacement: the population should be at least 10 times the sample size (N ≥ 10n, often called the 10% condition)
Sample size: One of these must be true:
- n ≥ 30 (works for most populations), OR
- The population is already normally distributed (then any n works)

Rule of thumb: If the population is strongly skewed or has extreme outliers, a sample of 30 may not be enough; you may need n ≥ 50 or more for the CLT to work well.
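The checklist above can be written as a small helper. This is a rough sketch, not a standard function: the name clt_concerns and its parameters are hypothetical, and the thresholds (30 and 10n) are simply the rules of thumb stated in this lesson. It cannot judge skewness or outliers, so a clean result still calls for a look at the data.

```python
def clt_concerns(n, population_size=None, population_is_normal=False,
                 random_sample=True):
    """Return a list of reasons to doubt the CLT approximation.

    Hypothetical helper: it only encodes the lesson's rules of thumb.
    """
    concerns = []
    if not random_sample:
        concerns.append("sample is not random")
    # 10% condition, relevant when sampling without replacement.
    if population_size is not None and population_size < 10 * n:
        concerns.append("population is smaller than 10n, independence is doubtful")
    # A small sample is acceptable only if the population itself is normal.
    if n < 30 and not population_is_normal:
        concerns.append("n < 30 and population is not known to be normal")
    return concerns

print(clt_concerns(36))                             # []
print(clt_concerns(25))                             # one concern: small n
print(clt_concerns(25, population_is_normal=True))  # []
print(clt_concerns(50, population_size=400))        # one concern: 10n rule
```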

3. Shape, Center, and Spread of Sampling Distribution

Center: μₓ̄ = μ

The mean of the sampling distribution equals the population mean. Sample means center on the true population mean, so x̄ is an unbiased estimator of μ.

Spread: σₓ̄ = σ/√n (Standard Error)

Standard Error Formula:

σₓ̄ = σ / √n

where σ = population standard deviation, n = sample size

The standard error (SE) measures how much sample means vary from sample to sample. Notice: As n increases, SE decreases. Larger samples give more precise estimates!

Key Insight:

Doubling the sample size doesn't halve the SE; it only divides the SE by √2 ≈ 1.41
To cut SE in half, you need to quadruple the sample size (because of √n)
Example: If n = 100 gives SE = 2, then n = 400 gives SE = 1
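The square-root effect is easy to see numerically. In this small sketch, σ = 20 is an assumed value, chosen only because it makes n = 100 give SE = 2 as in the example above; the function name is illustrative.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 20  # assumed, so that n = 100 gives SE = 2
for n in (100, 200, 400):
    print(n, round(standard_error(sigma, n), 3))
# 100 2.0
# 200 1.414
# 400 1.0
```

Doubling n from 100 to 200 only brings the SE down to about 1.41; it takes n = 400 to reach 1.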

Shape: Approximately Normal

When n is large enough (typically n ≥ 30), the sampling distribution of x̄ is approximately normal, regardless of the population's shape.

Example 1: Finding the Sampling Distribution

A population of exam scores has μ = 75 and σ = 12. We take random samples of size n = 36. Describe the sampling distribution of x̄.

Solution:

Check conditions:
- Assume random sampling
- n = 36 ≥ 30
- CLT applies!
Shape: Approximately normal (by CLT)
Center: μₓ̄ = μ = 75
Spread: σₓ̄ = σ/√n = 12/√36 = 12/6 = 2

Answer: The sampling distribution of x̄ is approximately normal with mean 75 and standard error 2. We write: x̄ ~ N(75, 2), where the second value is the standard deviation (the standard error), not the variance.
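The same description can be built in code. The sketch below uses NormalDist from Python's standard statistics module; the helper name sampling_distribution is an illustrative choice, and the result is the normal approximation given by the CLT, not the exact distribution of x̄.

```python
from math import sqrt
from statistics import NormalDist

def sampling_distribution(mu, sigma, n):
    """Normal approximation to the distribution of the sample mean."""
    return NormalDist(mu=mu, sigma=sigma / sqrt(n))

dist = sampling_distribution(mu=75, sigma=12, n=36)
print(dist.mean, dist.stdev)  # 75.0 2.0
```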

4. Working with Sampling Distributions

Once we know the sampling distribution is approximately normal, we can use z-scores and the normal distribution to find probabilities!

z-score for Sample Mean:

z = (x̄ - μₓ̄) / σₓ̄ = (x̄ - μ) / (σ/√n)

Example 2: Finding Probability for a Sample Mean

Weights of apples have μ = 150 grams and σ = 20 grams. If we select a random sample of 64 apples, what is the probability that the sample mean weight is less than 145 grams?

Solution:

Step 1: Check CLT conditions

Random sample
n = 64 ≥ 30
CLT applies: x̄ ~ N(μ, σ/√n)

Step 2: Find the sampling distribution

μₓ̄ = μ = 150
σₓ̄ = σ/√n = 20/√64 = 20/8 = 2.5
x̄ ~ N(150, 2.5)

Step 3: Calculate z-score

z = (x̄ - μ) / (σ/√n) = (145 - 150) / 2.5 = -5 / 2.5 = -2.0

Step 4: Find probability

P(x̄ < 145) = P(z < -2.0) = 0.0228 (from z-table)

Answer: There's about a 2.28% chance the sample mean will be less than 145 grams. This is unlikely!
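If a z-table isn't handy, the four steps can be checked with a few lines of Python. This is a minimal sketch using the standard library's NormalDist; the variable names are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 150, 20, 64

se = sigma / sqrt(n)      # standard error: 20 / 8 = 2.5
z = (145 - mu) / se       # z-score for a sample mean of 145: -2.0
p = NormalDist().cdf(z)   # P(Z < -2.0) under the standard normal

print(se, z, round(p, 4))  # 2.5 -2.0 0.0228
```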

Example 3: Finding a Range of Sample Means

SAT scores have μ = 1050 and σ = 200. For random samples of 100 students, what range contains the middle 95% of sample means?

Solution:

Step 1: Find sampling distribution

μₓ̄ = 1050
σₓ̄ = 200/√100 = 200/10 = 20

Step 2: Find z-scores for middle 95%

Middle 95% means 2.5% in each tail → z = ±1.96

Step 3: Convert z-scores to x̄ values

Lower bound: x̄ = μ + z·σₓ̄ = 1050 + (-1.96)(20) = 1050 - 39.2 = 1010.8
Upper bound: x̄ = μ + z·σₓ̄ = 1050 + (1.96)(20) = 1050 + 39.2 = 1089.2

Answer: The middle 95% of sample means fall between 1010.8 and 1089.2.

Interpretation: If we repeatedly take random samples of 100 students, about 95% of the sample means will fall between 1010.8 and 1089.2.
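The z = ±1.96 cutoff and the two bounds can also be computed rather than looked up. The sketch below assumes the normal approximation from the CLT and uses NormalDist.inv_cdf from the standard library to find the z-value with 97.5% of the area below it.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1050, 200, 100
se = sigma / sqrt(n)                  # 200 / 10 = 20

z_star = NormalDist().inv_cdf(0.975)  # about 1.96 (leaves 2.5% in the upper tail)
lower = mu - z_star * se
upper = mu + z_star * se

print(round(z_star, 2))                  # 1.96
print(round(lower, 1), round(upper, 1))  # 1010.8 1089.2
```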

Example 4: Comparing Individual Values to Sample Means

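Heights of adult men have μ = 70 inches and σ = 3 inches. Assume heights are approximately normally distributed, so the normal model applies both to individual values and to means of samples smaller than 30. Compare: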

(a) Probability that one randomly selected man is taller than 73 inches
(b) Probability that the mean height of 25 randomly selected men exceeds 73 inches

Solution:

Part (a): Individual value

z = (x - μ) / σ = (73 - 70) / 3 = 1.0
P(x > 73) = P(z > 1.0) = 1 - 0.8413 = 0.1587 ≈ 15.87%

Part (b): Sample mean

σₓ̄ = σ/√n = 3/√25 = 3/5 = 0.6
z = (x̄ - μ) / σₓ̄ = (73 - 70) / 0.6 = 3 / 0.6 = 5.0
P(x̄ > 73) = P(z > 5.0) ≈ 0.0000003 (essentially 0)

Answer:

(a) About 16% chance for one man to exceed 73 inches
(b) Essentially 0% chance for the mean of 25 men to exceed 73 inches

Key insight: Sample means vary much less than individual values! The SE (0.6) is much smaller than σ (3).
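Both parts can be computed side by side. This sketch assumes heights are normally distributed (an assumption needed for part (a), and for part (b) because n = 25 is below 30) and uses the standard library's NormalDist.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 70, 3, 25

# (a) One man: individual heights vary with standard deviation sigma.
p_individual = 1 - NormalDist(mu, sigma).cdf(73)

# (b) Mean of 25 men: sample means vary with standard error sigma / sqrt(n).
se = sigma / sqrt(n)
p_mean = 1 - NormalDist(mu, se).cdf(73)

print(round(p_individual, 4))  # 0.1587
print(p_mean)                  # roughly 3e-07, essentially 0
```

The same cutoff of 73 inches is 1 standard deviation above the mean for an individual but 5 standard errors above it for a sample mean.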

Example 5: Required Sample Size

A population has μ = 100 and σ = 24. What sample size is needed so that the standard error is no more than 3?

Solution:

We want: σₓ̄ ≤ 3

σ/√n ≤ 3

24/√n ≤ 3

24 ≤ 3√n

8 ≤ √n

64 ≤ n

Answer: We need at least n = 64 observations.
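The algebra above amounts to n ≥ (σ / SE)², rounded up to a whole number. A minimal sketch follows; the function name min_sample_size is hypothetical.

```python
from math import ceil

def min_sample_size(sigma, max_se):
    """Smallest whole n with sigma / sqrt(n) <= max_se, i.e. n >= (sigma / max_se)^2."""
    return ceil((sigma / max_se) ** 2)

print(min_sample_size(24, 3))  # 64
print(min_sample_size(24, 5))  # 24, since (24 / 5)^2 = 23.04 rounds up
```

Rounding up matters: a fractional result such as 23.04 means 23 observations would leave the SE slightly above the target.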

Check Your Understanding

Question 1: A population has mean 50 and standard deviation 8. For samples of size 64, what are the mean and standard error of the sampling distribution of x̄?

Answer: μₓ̄ = 50, σₓ̄ = 1

Explanation:

μₓ̄ = μ = 50
σₓ̄ = σ/√n = 8/√64 = 8/8 = 1

Question 2: True or False: The Central Limit Theorem says that if we take a large enough sample, the population distribution will be approximately normal.

Answer: False

Explanation: The CLT says the sampling distribution of x̄ will be approximately normal, not the population distribution. The population can have any shape; the CLT tells us about the distribution of sample means, not individual values.

Question 3: If you want to reduce the standard error by half, by what factor must you increase the sample size?

Answer: Multiply sample size by 4

Explanation: Since σₓ̄ = σ/√n, to cut SE in half:

Original: σₓ̄ = σ/√n
Want: σₓ̄/2 = σ/√(new n)
σ/(2√n) = σ/√(new n)
√(new n) = 2√n
new n = 4n

Question 4: A population has μ = 80 and σ = 15. For samples of size 100, find P(x̄ > 82).

Answer: P(x̄ > 82) ≈ 0.0918 or 9.18%

Solution:

σₓ̄ = 15/√100 = 15/10 = 1.5
z = (82 - 80) / 1.5 = 2 / 1.5 ≈ 1.33
P(z > 1.33) = 1 - 0.9082 = 0.0918
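A z-table forces z to be rounded to two decimals, so software may give a slightly different answer. The sketch below, using the standard library's NormalDist, compares the unrounded z-score with the table value of 1.33; the variable names are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 80, 15, 100
se = sigma / sqrt(n)       # 1.5
z_exact = (82 - mu) / se   # 1.3333...

p_exact = 1 - NormalDist().cdf(z_exact)            # unrounded z
p_table = 1 - NormalDist().cdf(round(z_exact, 2))  # z rounded to 1.33, as in a table

print(round(p_exact, 4), round(p_table, 4))  # about 0.0912 and 0.0918
```

Both values round to roughly 9%, so either is acceptable; the small gap comes only from rounding z.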

Question 5: When can we NOT use the Central Limit Theorem?

Answer: When:

The sample is not random
Observations are not independent
Sample size is too small (n < 30) AND population is not normal
Population has extreme outliers/skewness and n is only slightly above 30

Lesson Summary

The Central Limit Theorem states that x̄ is approximately N(μ, σ/√n) when n is large (generally n ≥ 30)
CLT works regardless of population shape (if n is large enough)
Standard Error: σₓ̄ = σ/√n measures variability of sample means
Sample means vary less than individual values (SE is smaller than σ whenever n > 1)
Use z = (x̄ - μ) / (σ/√n) to find probabilities about sample means
Larger samples give smaller SE and more precise estimates

    
