Learn Without Walls

Lesson 1: Introduction to the Normal Distribution

Learn about the bell curve and its fundamental properties

What is a Normal Distribution?

The normal distribution (also called the Gaussian distribution or bell curve) is one of the most important probability distributions in statistics. It appears naturally in many real-world situations and forms the foundation for many statistical methods.

Definition: Normal Distribution

A normal distribution is a continuous probability distribution that is:

  • Symmetric around its mean
  • Bell-shaped with a single peak at the center
  • Completely described by two parameters: mean (μ) and standard deviation (σ)
  • Such that mean = median = mode (all at the center)

The graph of a normal distribution is a symmetric, bell-shaped curve.

The highest point is at the center (the mean), and it tapers off symmetrically on both sides.
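The shape described above can be checked numerically. The sketch below uses the standard formula for the normal curve, f(x) = exp(−(x − μ)² / (2σ²)) / (σ√(2π)), which this lesson does not state explicitly. The function name normal_pdf and the values μ = 10, σ = 2 are assumptions chosen only for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

mu, sigma = 10.0, 2.0  # illustrative values, not from the lesson

# Symmetry: points the same distance left and right of the mean have equal height.
left = normal_pdf(mu - 3.0, mu, sigma)
right = normal_pdf(mu + 3.0, mu, sigma)

# Single peak: the curve is highest at the mean.
peak = normal_pdf(mu, mu, sigma)

print(left, right, peak)
```

The two side heights match, and both are lower than the height at the mean, which is what "symmetric with a single peak at the center" means in practice.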

Key Properties of the Normal Distribution

1. Symmetry

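The normal distribution is perfectly symmetric around its mean. This means the left half of the curve is a mirror image of the right half, so 50% of values fall below the mean and 50% fall above it.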

2. Bell Shape

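The distribution has a distinctive bell shape: a single peak at the mean, with the curve falling away on both sides and getting closer and closer to the horizontal axis without ever touching it.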

3. Determined by Two Parameters

Every normal distribution is completely described by just two values:

Parameters of the Normal Distribution:

  • μ (mu) = population mean (center of the distribution)
  • σ (sigma) = population standard deviation (spread of the distribution)

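We write: X ~ N(μ, σ) to indicate "X follows a normal distribution with mean μ and standard deviation σ." Some textbooks write N(μ, σ²) with the variance as the second value; in this lesson the second value is always the standard deviation.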

Example 1: Understanding Parameters

Scenario: Heights of adult women in the US follow approximately N(64, 2.5)

What this means:

  • Mean height: μ = 64 inches
  • Standard deviation: σ = 2.5 inches
  • The distribution is bell-shaped and symmetric around 64 inches
  • Most women's heights cluster around 64 inches, with fewer at extreme heights
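One way to get a feel for what N(64, 2.5) means is to simulate it. The sketch below uses Python's standard-library statistics.NormalDist to draw simulated heights; the sample size of 10,000 and the seed are arbitrary choices for illustration, not part of the example.

```python
from statistics import NormalDist, mean, stdev

heights = NormalDist(mu=64, sigma=2.5)  # N(64, 2.5) from Example 1

# Draw simulated heights; the seed only makes the run repeatable.
sample = heights.samples(10_000, seed=1)

sample_mean = mean(sample)
sample_sd = stdev(sample)

print(round(sample_mean, 1), round(sample_sd, 1))
```

With a sample this large, the sample mean and sample standard deviation should land close to the parameters 64 and 2.5.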

4. How Standard Deviation Affects Shape

The standard deviation (σ) controls the spread of the distribution: a smaller σ gives a narrower, taller curve, and a larger σ gives a wider, flatter curve.

Key Insight: Two normal distributions can have the same mean but different standard deviations, making one more "spread out" than the other. Similarly, they can have different means but the same standard deviation.
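The effect of σ on the curve can be seen by comparing curve heights directly. This is a minimal sketch using statistics.NormalDist; the specific means and standard deviations (0, 5, 1, 2) are illustrative assumptions.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)    # same mean, larger standard deviation
shifted = NormalDist(mu=5, sigma=1) # different mean, same standard deviation

# Same center, different spread: the narrow curve is taller at the mean...
peak_narrow = narrow.pdf(0)
peak_wide = wide.pdf(0)

# ...while the wide curve is higher far from the mean.
tail_narrow = narrow.pdf(3)
tail_wide = wide.pdf(3)

# Changing only the mean slides the curve sideways without changing its shape.
peak_shifted = shifted.pdf(5)

print(peak_narrow, peak_wide, peak_shifted)
```

Doubling σ halves the height of the peak, because the total area under the curve must stay equal to 1.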

The Empirical Rule (68-95-99.7 Rule)

For any normal distribution, approximately:

68% of data falls within 1 standard deviation of the mean

[μ − σ to μ + σ]

95% of data falls within 2 standard deviations of the mean

[μ − 2σ to μ + 2σ]

99.7% of data falls within 3 standard deviations of the mean

[μ − 3σ to μ + 3σ]

Memorize This! 68-95-99.7 is one of the most useful facts in all of statistics. You'll use it constantly!
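The three percentages are rounded values. The sketch below computes the more precise figures from the standard normal distribution (mean 0, standard deviation 1) using statistics.NormalDist; the helper name within is an assumption made for this example.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def within(k):
    """Probability of landing within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within(k) * 100, 2))
```

The results are about 68.27%, 95.45%, and 99.73%, which is why the rule says "approximately."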

Example 2: Applying the Empirical Rule

Scenario: SAT scores are normally distributed with μ = 1050 and σ = 100

(a) What percentage of students score between 950 and 1150?

Solution:

  • 950 = 1050 − 100 = μ − σ
  • 1150 = 1050 + 100 = μ + σ
  • This is within 1 standard deviation of the mean
  • Answer: About 68%

(b) What percentage score between 850 and 1250?

Solution:

  • 850 = 1050 − 200 = μ − 2σ
  • 1250 = 1050 + 200 = μ + 2σ
  • This is within 2 standard deviations
  • Answer: About 95%

(c) What percentage score above 1150?

Solution:

  • 68% score between 950 and 1150 (within 1 SD)
  • By symmetry, the remaining 32% is split equally: 16% below 950, 16% above 1150
  • Answer: About 16%
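The three answers can be cross-checked against the normal curve itself. This sketch assumes SAT scores follow N(1050, 100) exactly, as the scenario states, and uses the cumulative distribution function cdf from statistics.NormalDist, which gives the share of values at or below a score.

```python
from statistics import NormalDist

sat = NormalDist(mu=1050, sigma=100)

part_a = sat.cdf(1150) - sat.cdf(950)  # between 950 and 1150
part_b = sat.cdf(1250) - sat.cdf(850)  # between 850 and 1250
part_c = 1 - sat.cdf(1150)             # above 1150

# Symmetry check for part (c): the share below 950 equals the share above 1150.
below_950 = sat.cdf(950)

print(round(part_a, 3), round(part_b, 3), round(part_c, 3))
```

Each computed value agrees with the Empirical Rule answer to within about one percentage point.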

Example 3: Real-World Application

Scenario: Adult male systolic blood pressure is N(120, 15) mmHg

Question: A blood pressure above 150 is considered high. What percentage of adult males have high blood pressure?

Solution:

  • μ = 120, σ = 15
  • 150 = 120 + 30 = μ + 2σ (two standard deviations above the mean)
  • By Empirical Rule: 95% fall between μ − 2σ and μ + 2σ
  • So 5% fall outside this range
  • By symmetry: 2.5% below μ − 2σ, and 2.5% above μ + 2σ
  • Answer: About 2.5% have blood pressure above 150
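Because the Empirical Rule rounds 95.45% down to 95%, the 2.5% figure is a slight overestimate. A short sketch, assuming the N(120, 15) model from the scenario holds exactly, shows the more precise tail area.

```python
from statistics import NormalDist

bp = NormalDist(mu=120, sigma=15)

# Share of values above 150, which is 2 standard deviations above the mean.
high = 1 - bp.cdf(150)

print(round(high * 100, 2))
```

The computed share is roughly 2.3%, close enough to 2.5% for quick estimates.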

Why is the Normal Distribution So Important?

The normal distribution is central to statistics for several reasons:

1. Many Natural Phenomena are Normally Distributed

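Examples used in this lesson include adult heights, blood pressure readings, test scores, and the fill volume of manufactured bottles, all of which are approximately normal.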

2. Central Limit Theorem

Even when individual data points aren't normally distributed, the averages of sufficiently large samples tend to be approximately normally distributed (you'll learn more about this in Module 5: Sampling Distributions).
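As a preview of that idea, the simulation below averages draws from a strongly right-skewed distribution and checks whether the averages behave like normal data. The exponential distribution, the sample size of 30, and the 2,000 repetitions are all assumptions chosen for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Average of n draws from a right-skewed (exponential) distribution."""
    return mean(rng.expovariate(1.0) for _ in range(n))

# Means of 2,000 samples, each of size 30 (both numbers are illustrative).
means = [sample_mean(30) for _ in range(2000)]

center = mean(means)
spread = stdev(means)

# If the means are roughly normal, about 68% should lie within 1 SD of their center.
share_within_1sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(round(center, 2), round(share_within_1sd, 2))
```

Even though the individual draws are far from bell-shaped, the share of sample means within one standard deviation should come out near 68%.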

3. Foundation for Inference

Many statistical methods (confidence intervals, hypothesis tests) are based on the normal distribution.

Important Note: Not everything is normally distributed! Always check your data. Skewed distributions, outliers, or bimodal patterns indicate the normal distribution may not be appropriate.
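One rough way to check data is to compare it against the 68-95-99.7 pattern. The sketch below is an informal check, not a formal normality test; the helper name shares_within_sd and both simulated data sets are assumptions made for this example.

```python
import random
from statistics import mean, stdev

def shares_within_sd(data):
    """Fractions of data within 1, 2, and 3 sample SDs of the sample mean."""
    m, s = mean(data), stdev(data)
    return [sum(abs(x - m) <= k * s for x in data) / len(data) for k in (1, 2, 3)]

rng = random.Random(0)  # fixed seed so the run is repeatable
bell_shaped = [rng.gauss(0, 1) for _ in range(5000)]    # simulated normal data
skewed = [rng.expovariate(1.0) for _ in range(5000)]    # simulated right-skewed data

normal_shares = shares_within_sd(bell_shaped)
skewed_shares = shares_within_sd(skewed)

print([round(p, 2) for p in normal_shares])
print([round(p, 2) for p in skewed_shares])
```

The normal data should track 68-95-99.7 closely, while the skewed data puts well over 68% of its values within one standard deviation, a sign that the normal model does not fit.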

Check Your Understanding

Try these questions to test what you've learned in this lesson.

Question 1: A factory produces bottles with volume N(500, 5) mL. What does σ = 5 tell you?

Answer: σ = 5 mL is the standard deviation, which tells us the typical amount by which bottle volumes vary from the mean of 500 mL. By the Empirical Rule, about 68% of bottles will be within 5 mL of 500 mL (between 495 and 505 mL).

Question 2: Using the Empirical Rule, approximately what percentage of bottles have volume between 490 and 510 mL?

Answer: About 95%

Explanation: 490 = 500 − 10 = μ − 2σ and 510 = 500 + 10 = μ + 2σ. This is within 2 standard deviations, so by the Empirical Rule, approximately 95% of bottles fall in this range.

Question 3: If scores on a test are N(75, 10), what score is 1 standard deviation below the mean?

Answer: 65

Explanation: μ − σ = 75 − 10 = 65

Question 4: For the test in Question 3, approximately what percentage of students score above 95?

Answer: About 2.5%

Explanation: 95 = 75 + 20 = μ + 2σ. By the Empirical Rule, 95% fall within 2 SDs (55 to 95), leaving 5% outside. Half of that (2.5%) is above 95.

Question 5: True or False: In a normal distribution, the mean is always larger than the median.

Answer: False

Explanation: In a normal distribution, the mean, median, and mode are all equal because the distribution is symmetric.

Key Takeaways from Lesson 1
