Lesson 1: Introduction to the Normal Distribution
Learn about the bell curve and its fundamental properties
What is a Normal Distribution?
The normal distribution (also called the Gaussian distribution or bell curve) is one of the most important probability distributions in statistics. It appears naturally in many real-world situations and forms the foundation for many statistical methods.
Definition: Normal Distribution
A normal distribution is a continuous probability distribution that is:
- Symmetric around its mean
- Bell-shaped with a single peak at the center
- Completely described by two parameters: mean (μ) and standard deviation (σ)
- Centered so that the mean, median, and mode coincide (mean = median = mode)
The graph of a normal distribution is a symmetric, bell-shaped curve:
[Figure: bell curve centered at the mean μ]
The highest point is at the center (the mean), and it tapers off symmetrically on both sides.
Key Properties of the Normal Distribution
1. Symmetry
The normal distribution is perfectly symmetric around its mean. This means:
- The left half is a mirror image of the right half
- 50% of the data falls below the mean, 50% above
- The mean, median, and mode are all equal
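The symmetry claims above can be checked numerically. The sketch below uses NormalDist from Python's standard statistics module (Python 3.8 or later); the mean of 100 and standard deviation of 15 are illustrative values chosen for this sketch, not figures from the lesson.

```python
from statistics import NormalDist

# Illustrative parameters (assumed for this sketch, not taken from the lesson)
dist = NormalDist(mu=100, sigma=15)

# Half of the probability lies below the mean
below_mean = dist.cdf(100)

# Curve height at equal distances on either side of the mean
left_height = dist.pdf(100 - 20)
right_height = dist.pdf(100 + 20)

print(f"P(X < mean) = {below_mean:.3f}")
print(f"height at 80 = {left_height:.6f}, height at 120 = {right_height:.6f}")
```

The two heights match because the curve's height depends only on the distance from the mean, not on the direction.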
2. Bell Shape
The distribution has a distinctive bell shape:
- Single peak at the center (unimodal)
- Values near the mean are more frequent
- Values far from the mean are less frequent
- The tails extend infinitely in both directions (but probabilities get very small)
3. Determined by Two Parameters
Every normal distribution is completely described by just two values:
Parameters of the Normal Distribution:
- μ (mu) = population mean (center of the distribution)
- σ (sigma) = population standard deviation (spread of the distribution)
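We write X ~ N(μ, σ) to indicate "X follows a normal distribution with mean μ and standard deviation σ." Note that many textbooks instead write N(μ, σ²), with the variance as the second value; this lesson always uses the standard deviation.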
Example 1: Understanding Parameters
Scenario: Heights of adult women in the US follow approximately N(64, 2.5)
What this means:
- Mean height: μ = 64 inches
- Standard deviation: σ = 2.5 inches
- The distribution is bell-shaped and symmetric around 64 inches
- Most women's heights cluster around 64 inches, with fewer at extreme heights
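Example 1 can be mirrored in a few lines of Python. This is a minimal sketch using NormalDist from the standard statistics module; the comparison height of 72 inches is an arbitrary value picked for illustration.

```python
from statistics import NormalDist

# Heights of adult women from Example 1: N(64, 2.5), in inches
heights = NormalDist(mu=64, sigma=2.5)

print("mean:  ", heights.mean)    # center of the distribution
print("stdev: ", heights.stdev)   # spread of the distribution
print("median:", heights.median)  # equals the mean for a normal distribution
print("mode:  ", heights.mode)    # equals the mean as well

# The curve is higher near the mean than far from it
near = heights.pdf(64)
far = heights.pdf(72)  # 72 inches is an arbitrary "extreme" height for comparison
print(f"height of curve at 64 = {near:.4f}, at 72 = {far:.6f}")
```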
4. How Standard Deviation Affects Shape
The standard deviation (σ) controls the spread of the distribution:
- Smaller σ: Narrower, taller bell curve (data more concentrated near mean)
- Larger σ: Wider, flatter bell curve (data more spread out)
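One way to see the "narrower and taller" versus "wider and flatter" effect is to compare the height of the curve at its center for several values of σ. The sketch below assumes a mean of 0 and three illustrative σ values; none of these numbers come from the lesson.

```python
from statistics import NormalDist

mu = 0  # illustrative center; the peak height does not depend on it
peaks = {}
for sigma in (0.5, 1, 2):
    curve = NormalDist(mu, sigma)
    peaks[sigma] = curve.pdf(mu)  # height of the curve at its center
    print(f"sigma = {sigma}: peak height = {peaks[sigma]:.4f}")
```

Each time σ doubles, the peak height is cut in half: the total area under the curve stays at 1, so a wider curve has to be flatter.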
The Empirical Rule (68-95-99.7 Rule)
For any normal distribution, approximately:
- 68% of data falls within 1 standard deviation of the mean (from μ − σ to μ + σ)
- 95% of data falls within 2 standard deviations of the mean (from μ − 2σ to μ + 2σ)
- 99.7% of data falls within 3 standard deviations of the mean (from μ − 3σ to μ + 3σ)
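The 68-95-99.7 figures are rounded values. Because the proportions are the same for every normal distribution, they can be computed once from the standard normal (mean 0, standard deviation 1). The helper name `within` below is made up for this sketch.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def within(k):
    """Probability of falling within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```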
Example 2: Applying the Empirical Rule
Scenario: SAT scores are normally distributed with μ = 1050 and σ = 100
(a) What percentage of students score between 950 and 1150?
Solution:
- 950 = 1050 − 100 = μ − σ
- 1150 = 1050 + 100 = μ + σ
- This is within 1 standard deviation of the mean
- Answer: About 68%
(b) What percentage score between 850 and 1250?
Solution:
- 850 = 1050 − 200 = μ − 2σ
- 1250 = 1050 + 200 = μ + 2σ
- This is within 2 standard deviations
- Answer: About 95%
(c) What percentage score above 1150?
Solution:
- 68% score between 950 and 1150 (within 1 SD)
- By symmetry, the remaining 32% is split equally: 16% below 950, 16% above 1150
- Answer: About 16%
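The three answers in Example 2 can be cross-checked with exact areas under the curve. This sketch uses the lesson's model for SAT scores; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

sat = NormalDist(mu=1050, sigma=100)  # SAT model from Example 2

part_a = sat.cdf(1150) - sat.cdf(950)   # between 950 and 1150
part_b = sat.cdf(1250) - sat.cdf(850)   # between 850 and 1250
part_c = 1 - sat.cdf(1150)              # above 1150

print(f"(a) {part_a:.1%}")
print(f"(b) {part_b:.1%}")
print(f"(c) {part_c:.1%}")
```

The exact values (about 68.3%, 95.4%, and 15.9%) agree with the Empirical Rule answers after rounding.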
Example 3: Real-World Application
Scenario: Adult male systolic blood pressure is N(120, 15) mmHg
Question: A blood pressure above 150 is considered high. What percentage of adult males have high blood pressure?
Solution:
- μ = 120, σ = 15
- 150 = 120 + 30 = μ + 2σ (two standard deviations above mean)
- By the Empirical Rule: 95% fall between μ − 2σ and μ + 2σ
- So 5% fall outside this range
- By symmetry: 2.5% below μ − 2σ, and 2.5% above μ + 2σ
- Answer: About 2.5% have blood pressure above 150
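As a rough check on Example 3, the sketch below computes the exact tail area under the N(120, 15) model and also estimates it by simulation. The simulation is an extra technique not used in the lesson; the sample size of 100,000 and the seed are arbitrary choices.

```python
from statistics import NormalDist

bp = NormalDist(mu=120, sigma=15)  # systolic blood pressure from Example 3, in mmHg

# Exact area under the curve above 150
exact_high = 1 - bp.cdf(150)

# Simulation: draw readings from the model and count how many exceed 150
readings = bp.samples(100_000, seed=1)
simulated_high = sum(r > 150 for r in readings) / len(readings)

print(f"exact:     {exact_high:.4f}")
print(f"simulated: {simulated_high:.4f}")
```

The exact value is about 2.3%, slightly below the 2.5% given by the Empirical Rule, because the rule rounds 95.45% down to 95%.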
Why is the Normal Distribution So Important?
The normal distribution is central to statistics for several reasons:
1. Many Natural Phenomena are Approximately Normally Distributed
Quantities that are often modeled as approximately normal include:
- Human heights and weights
- IQ scores and test scores
- Measurement errors in scientific experiments
- Blood pressure readings
- Annual rainfall in a region
2. Central Limit Theorem
Even when individual data points aren't normally distributed, the averages of sufficiently large random samples tend to be approximately normally distributed (you'll learn more about this in Module 5: Sampling Distributions).
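A small simulation can illustrate this idea. The sketch below is a demonstration under assumptions chosen here, not part of the lesson: the raw data come from a skewed exponential distribution with mean 1, each sample has 50 values, and 5,000 samples are drawn. It compares how closely the raw data and the sample means follow the "68% within 1 SD" pattern.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the run is repeatable

def fraction_within_one_sd(values):
    """Share of values within 1 sample SD of their sample mean."""
    m, s = mean(values), stdev(values)
    return sum(m - s <= v <= m + s for v in values) / len(values)

# Skewed raw data: exponential with mean 1 (clearly not bell-shaped)
raw = [rng.expovariate(1.0) for _ in range(5000)]

# Means of 5000 samples, each of size 50, drawn from the same distribution
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(50)) / 50
    for _ in range(5000)
]

raw_frac = fraction_within_one_sd(raw)
mean_frac = fraction_within_one_sd(sample_means)
print(f"raw data within 1 SD:     {raw_frac:.3f}")
print(f"sample means within 1 SD: {mean_frac:.3f}")
```

The raw data land well above 68% (for an exponential distribution the true figure is about 86%), while the sample means come out close to the 68% expected for a normal distribution.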
3. Foundation for Inference
Many statistical methods (confidence intervals, hypothesis tests) are based on the normal distribution.
Check Your Understanding
Try these questions to test what you've learned in this lesson.
Question 1: A factory produces bottles with volume N(500, 5) mL. What does σ = 5 tell you?
Answer: σ = 5 mL is the standard deviation, which measures how much bottle volumes typically vary from the mean of 500 mL. By the Empirical Rule, about 68% of bottles will be within 5 mL of 500 mL (between 495 and 505 mL).
Question 2: Using the Empirical Rule, approximately what percentage of bottles have volume between 490 and 510 mL?
Answer: About 95%
Explanation: 490 = 500 − 10 = μ − 2σ and 510 = 500 + 10 = μ + 2σ. This is within 2 standard deviations, so by the Empirical Rule, approximately 95% of bottles fall in this range.
Question 3: If scores on a test are N(75, 10), what score is 1 standard deviation below the mean?
Answer: 65
Explanation: μ − σ = 75 − 10 = 65
Question 4: For the test in Question 3, approximately what percentage of students score above 95?
Answer: About 2.5%
Explanation: 95 = 75 + 20 = μ + 2σ. By the Empirical Rule, 95% fall within 2 SDs (55 to 95), leaving 5% outside. Half of that (2.5%) is above 95.
Question 5: True or False: In a normal distribution, the mean is always larger than the median.
Answer: False
Explanation: In a normal distribution, the mean, median, and mode are all equal because the distribution is symmetric.
Key Takeaways from Lesson 1
- The normal distribution is symmetric and bell-shaped
- It's completely described by mean (μ) and standard deviation (σ)
- In a normal distribution, mean = median = mode
- Empirical Rule: 68% within 1 SD, 95% within 2 SD, 99.7% within 3 SD
- Many real-world phenomena follow approximately normal distributions
- The normal distribution is a foundation for many methods of statistical inference