
Sampling Distribution of Proportions

Apply CLT concepts to categorical data and sample proportions

Lesson Objectives

By the end of this lesson, you will be able to:

Understand the sampling distribution of sample proportions (p̂)
Apply the success-failure condition for normal approximation
Calculate the mean and standard error for sample proportions
Find probabilities involving sample proportions

1. Introduction to Sample Proportions

Sample Proportion

The sample proportion p̂ (pronounced "p-hat") is the fraction of observations in a sample that have a particular characteristic.

p̂ = x / n

where x = number of "successes" in the sample, n = sample size

Just like sample means (x̄) estimate population means (μ), sample proportions (p̂) estimate population proportions (p).

Example 1: Calculating Sample Proportion

In a random sample of 200 voters, 108 support Candidate A. What is the sample proportion?

Solution:

x = 108 (voters who support Candidate A)
n = 200 (total voters in sample)
p̂ = x/n = 108/200 = 0.54

Answer: p̂ = 0.54 or 54% of the sample supports Candidate A.

2. Sampling Distribution of p̂

Just as different samples produce different sample means (x̄), different samples produce different sample proportions (p̂). The sampling distribution of p̂ describes how p̂ varies across all possible samples.

Properties of the Sampling Distribution of p̂:

Center: μₚ̂ = p (the true population proportion)
Spread: σₚ̂ = √(p(1-p)/n) (standard error)
Shape: Approximately normal when certain conditions are met
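The center and spread properties above can be checked empirically. The following is a minimal simulation sketch, not part of the original lesson: the function name `simulate_p_hats`, the values p = 0.3 and n = 100, the number of samples, and the fixed seed are all illustrative assumptions.

```python
import math
import random
import statistics

def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return each sample's p-hat.

    Hypothetical helper for illustration; each observation is a "success"
    with probability p.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

p, n = 0.3, 100  # assumed population proportion and sample size
p_hats = simulate_p_hats(p, n, num_samples=5000)

simulated_mean = statistics.mean(p_hats)   # should land close to p
simulated_sd = statistics.pstdev(p_hats)   # should land close to the SE formula
theoretical_se = math.sqrt(p * (1 - p) / n)

print(f"mean of p-hats: {simulated_mean:.4f} (theory: {p})")
print(f"sd of p-hats:   {simulated_sd:.4f} (theory: {theoretical_se:.4f})")
```

With enough simulated samples, the mean of the p̂ values should sit near p and their standard deviation near √(p(1-p)/n), though any single run will differ slightly from the theoretical values.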

Standard Error for Sample Proportion:

σₚ̂ = √(p(1-p) / n)

Notice: Unlike the SE for means (σ/√n), the SE for proportions depends on p itself. The variability is highest when p = 0.5 and lower when p is near 0 or 1.
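To see how the standard error changes with p, the formula can be tabulated for a fixed sample size. This is a small sketch; the function name `standard_error` and the choice of n = 100 are assumptions made for illustration.

```python
import math

def standard_error(p: float, n: int) -> float:
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

# Tabulate the SE at a fixed, assumed sample size of n = 100.
se_by_p = {p: standard_error(p, 100) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
for p, se in se_by_p.items():
    print(f"p = {p:.1f}  SE = {se:.4f}")
```

The printed values rise toward p = 0.5 and fall symmetrically on either side, which matches the note above.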

3. Conditions for Normal Approximation

The sampling distribution of p̂ is approximately normal when the success-failure condition is met:

Success-Failure Condition:

Both of these must be true:

np ≥ 10 (expected number of successes)
n(1-p) ≥ 10 (expected number of failures)

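If both conditions are met, then p̂ is approximately N(p, √(p(1-p)/n)), where the second value is the standard deviation (the standard error) rather than the variance.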

Example 2: Checking the Success-Failure Condition

Determine whether normal approximation is appropriate for p̂ in these situations:

(a) p = 0.3, n = 50

np = 50(0.3) = 15 ≥ 10
n(1-p) = 50(0.7) = 35 ≥ 10
Result: Normal approximation is appropriate

(b) p = 0.05, n = 100

np = 100(0.05) = 5 < 10
n(1-p) = 100(0.95) = 95 ≥ 10
Result: Normal approximation is NOT appropriate (first condition fails)

(c) p = 0.2, n = 80

np = 80(0.2) = 16 ≥ 10
n(1-p) = 80(0.8) = 64 ≥ 10
Result: Normal approximation is appropriate
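The checks in Example 2 can be expressed as a short function. This is a sketch under stated assumptions: the name `success_failure_ok` and the small floating-point tolerance are illustrative choices, while the threshold of 10 comes from the condition stated above.

```python
def success_failure_ok(p: float, n: int, threshold: float = 10) -> bool:
    """Return True when both np and n(1-p) reach the threshold."""
    tolerance = 1e-9  # guards against rounding, e.g. 100 * (1 - 0.9) = 9.999...
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return (expected_successes >= threshold - tolerance
            and expected_failures >= threshold - tolerance)

# The three cases from Example 2.
for label, p, n in (("a", 0.3, 50), ("b", 0.05, 100), ("c", 0.2, 80)):
    verdict = "appropriate" if success_failure_ok(p, n) else "NOT appropriate"
    print(f"({label}) p = {p}, n = {n}: normal approximation is {verdict}")
```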

4. Finding Probabilities with Sample Proportions

When conditions are met, we can use the normal distribution to find probabilities about sample proportions.

z-score for Sample Proportion:

z = (p̂ - p) / √(p(1-p)/n)

Example 3: Probability Involving Sample Proportion

In a large city, 35% of residents support a new tax proposal (p = 0.35). A random sample of 200 residents is selected. What is the probability that the sample proportion supporting the tax is between 0.30 and 0.40?

Solution:

Step 1: Check conditions

np = 200(0.35) = 70 ≥ 10
n(1-p) = 200(0.65) = 130 ≥ 10
Normal approximation is appropriate

Step 2: Find sampling distribution

μₚ̂ = p = 0.35
σₚ̂ = √(p(1-p)/n) = √(0.35 × 0.65 / 200) = √(0.2275/200) = √0.0011375 ≈ 0.0337

Step 3: Calculate z-scores

For p̂ = 0.30: z = (0.30 - 0.35) / 0.0337 = -0.05 / 0.0337 ≈ -1.48
For p̂ = 0.40: z = (0.40 - 0.35) / 0.0337 = 0.05 / 0.0337 ≈ 1.48

Step 4: Find probability

P(-1.48 < z < 1.48) = P(z < 1.48) - P(z < -1.48)
= 0.9306 - 0.0694 = 0.8612

Answer: There's about an 86.12% chance that the sample proportion will be between 0.30 and 0.40.
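The four steps of Example 3 can also be carried out in software. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `prob_p_hat_between` is an assumption for illustration. Because the code does not round z to two decimals the way a z-table lookup does, its result can differ from 0.8612 in the third or fourth decimal place.

```python
import math
from statistics import NormalDist

def prob_p_hat_between(p: float, n: int, low: float, high: float) -> float:
    """Normal-approximation probability that p-hat falls between low and high.

    Assumes the success-failure condition has already been checked.
    """
    se = math.sqrt(p * (1 - p) / n)       # Step 2: standard error
    sampling_dist = NormalDist(mu=p, sigma=se)
    # Steps 3 and 4: the cdf standardizes internally, so no z-table is needed.
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)

prob = prob_p_hat_between(p=0.35, n=200, low=0.30, high=0.40)
print(f"P(0.30 < p-hat < 0.40) is about {prob:.4f}")
```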

Example 4: Finding Unusual Sample Proportions

A company claims that 10% of its products are defective (p = 0.10). A quality inspector takes a random sample of 400 products and finds 52 defective items (p̂ = 0.13). Is this sample result unusually high if the company's claim is true?

Solution:

Step 1: Check conditions

np = 400(0.10) = 40 ≥ 10
n(1-p) = 400(0.90) = 360 ≥ 10
Normal approximation is appropriate

Step 2: Find sampling distribution

μₚ̂ = 0.10
σₚ̂ = √(0.10 × 0.90 / 400) = √(0.09/400) = √0.000225 = 0.015

Step 3: Calculate z-score

z = (0.13 - 0.10) / 0.015 = 0.03 / 0.015 = 2.0

Step 4: Find probability

P(p̂ ≥ 0.13) = P(z ≥ 2.0) = 1 - 0.9772 = 0.0228

Answer: Only about 2.28% of samples would have p̂ ≥ 0.13 if the true proportion is 0.10. A result this far above the claimed value (2 standard errors) is unusual, so the inspector has reason to question the company's claim.
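The same upper-tail calculation can be sketched in code. The function name `upper_tail` is hypothetical, and the counts are the ones given in Example 4.

```python
import math
from statistics import NormalDist

def upper_tail(p: float, n: int, observed_successes: int):
    """Return (z, P(p-hat >= observed proportion)) under the claimed p."""
    p_hat = observed_successes / n
    se = math.sqrt(p * (1 - p) / n)
    z = (p_hat - p) / se
    tail = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, tail

z, tail = upper_tail(p=0.10, n=400, observed_successes=52)
print(f"z = {z:.2f}, P(p-hat >= 0.13) is about {tail:.4f}")
```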

Example 5: Finding Required Sample Size

A pollster wants to estimate the proportion of voters supporting a candidate (assume p ≈ 0.5). What sample size is needed so that the standard error is no more than 0.02?

Solution:

We want: σₚ̂ ≤ 0.02

√(p(1-p)/n) ≤ 0.02

Using p = 0.5 (worst case, maximum variability):

√(0.5 × 0.5 / n) ≤ 0.02

√(0.25/n) ≤ 0.02

0.25/n ≤ 0.0004

0.25 ≤ 0.0004n

n ≥ 0.25/0.0004 = 625

Answer: The pollster needs at least n = 625 voters in the sample.
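Solving the inequality for n gives n ≥ p(1-p) / SE², which can be wrapped in a small helper. This is a sketch: the name `min_sample_size` is an assumption, and the rounding step is only there to keep floating-point noise from pushing an exact answer such as 625 up to 626.

```python
import math

def min_sample_size(max_se: float, p: float = 0.5) -> int:
    """Smallest n with sqrt(p(1-p)/n) <= max_se; p = 0.5 is the worst case."""
    raw = p * (1 - p) / max_se ** 2
    # Round away tiny floating-point error before taking the ceiling.
    return math.ceil(round(raw, 9))

print(min_sample_size(0.02))  # the pollster's target from Example 5
print(min_sample_size(0.03))  # a looser target needs a smaller sample
```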

Check Your Understanding

Question 1: In a sample of 150 students, 45 are left-handed. What is the sample proportion?

Answer: p̂ = 0.30 or 30%

Calculation: p̂ = x/n = 45/150 = 0.30

Question 2: For p = 0.4 and n = 60, verify that normal approximation is appropriate for the sampling distribution of p̂.

Answer: Yes, normal approximation is appropriate.

Check:

np = 60(0.4) = 24 ≥ 10
n(1-p) = 60(0.6) = 36 ≥ 10

Both conditions are satisfied.

Question 3: If p = 0.6 and n = 100, what is the standard error of the sample proportion?

Answer: σₚ̂ ≈ 0.049 (about 0.05)

Calculation:

σₚ̂ = √(p(1-p)/n)
= √(0.6 × 0.4 / 100)
= √(0.24/100)
= √0.0024 ≈ 0.049

Question 4: True or False: The standard error for proportions is largest when p = 0.5.

Answer: True

Explanation: The expression p(1-p) is maximized when p = 0.5, giving p(1-p) = 0.5 × 0.5 = 0.25. When p is near 0 or 1, p(1-p) is smaller, resulting in less variability in sample proportions.

Question 5: Nationally, 25% of college students work full-time. In a random sample of 200 students, what's the probability that fewer than 20% work full-time?

Answer: P(p̂ < 0.20) ≈ 0.0516 or 5.16%

Solution:

Check: np = 200(0.25) = 50 ≥ 10, n(1-p) = 200(0.75) = 150 ≥ 10
σₚ̂ = √(0.25 × 0.75 / 200) = √(0.1875/200) ≈ 0.0306
z = (0.20 - 0.25) / 0.0306 = -0.05 / 0.0306 ≈ -1.63
P(z < -1.63) ≈ 0.0516

Lesson Summary

Sample proportion: p̂ = x/n estimates population proportion p
Sampling distribution of p̂: Mean = p, SE = √(p(1-p)/n)
Success-failure condition: np ≥ 10 AND n(1-p) ≥ 10 for normal approximation
z-score: z = (p̂ - p) / √(p(1-p)/n)
SE is largest when p = 0.5 (maximum variability)
Use normal distribution methods when conditions are met

    
