Module 4 Study Guide

Discrete Probability Distributions

Free Statistics Learning Platform • Safaa Dabagh

1. Random Variables

Random Variable: A variable whose value is determined by the outcome of a random event. We use X, Y, Z to denote random variables.

Types of Random Variables

Discrete Random Variables

Definition: Takes on a countable set of distinct values

Continuous Random Variables

Definition: Takes on any value in an interval, so its possible values cannot be listed one by one

KEY: Module 4 focuses on DISCRETE random variables!

2. Probability Distributions

Probability Distribution: Lists all possible values of a random variable and their corresponding probabilities.

Requirements for Valid Probability Distribution

Each probability P(X = x) is between 0 and 1 (inclusive)
The sum of all probabilities equals exactly 1.0: Σ P(X = x) = 1

Example: US Household Size Distribution

Scenario: The probability distribution for US household size:
Household Size (X)    Probability P(X = x)
1 person              0.28
2 people              0.34
3 people              0.15
4 people              0.13
5+ people             0.10
Total                 1.00
Check: All probabilities sum to 1.00
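The two validity requirements can be checked mechanically. The sketch below is one possible way to do it in Python; the function name is_valid_distribution and the dictionary layout are illustrative choices, and the "5+ people" category is keyed as 5 purely for convenience.

```python
import math

# Household-size distribution from the table (the "5+" category is keyed as 5).
household = {1: 0.28, 2: 0.34, 3: 0.15, 4: 0.13, 5: 0.10}

def is_valid_distribution(dist, tol=1e-9):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in dist.values())
    # Compare with a small tolerance because decimal fractions are not exact in floating point.
    sums_to_one = math.isclose(sum(dist.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

print(is_valid_distribution(household))         # True
print(is_valid_distribution({0: 0.5, 1: 0.6}))  # False: the probabilities sum to 1.1
```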

Visualization of Probability Distributions

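A probability histogram shows each possible value of X on the horizontal axis, with a bar over each value whose height equals P(X = x).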

For household size, the bar at X=2 has height 0.34 (34% of households have 2 people).
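To get a rough feel for the shape without plotting software, a probability histogram can be drawn as text. This is only a sketch: the helper bar_length and the scale of one "#" per two percentage points are arbitrary illustrative choices, and "5+" is again keyed as 5.

```python
# Household-size distribution (the "5+" category is keyed as 5).
household = {1: 0.28, 2: 0.34, 3: 0.15, 4: 0.13, 5: 0.10}

def bar_length(prob):
    """Number of '#' characters for a probability: one per two percentage points."""
    return round(prob * 100) // 2

for size, prob in household.items():
    label = "5+" if size == 5 else str(size)
    # Longer bars correspond to more likely household sizes.
    print(f"{label:>2} | {'#' * bar_length(prob)} {prob:.2f}")
```

The longest bar appears at X = 2, matching the largest probability in the table.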

3. Expected Value (Mean)

Expected Value, E(X): The long-run average value when a random experiment is repeated many times. It's the "theoretical mean" of a probability distribution.

Formula for Expected Value

E(X) = Σ [x · P(X = x)]

Interpretation: Multiply each value by its probability, then sum all products.

Fully Worked Example: US Household Size

Calculate E(X) using the household size distribution:

Step 1: Multiply each value by its probability

Size (x)    P(X = x)    x · P(X = x)
1           0.28        1 × 0.28 = 0.28
2           0.34        2 × 0.34 = 0.68
3           0.15        3 × 0.15 = 0.45
4           0.13        4 × 0.13 = 0.52
5+          0.10        5 × 0.10 = 0.50

Step 2: Sum all the products

E(X) = 0.28 + 0.68 + 0.45 + 0.52 + 0.50 = 2.43

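Interpretation: The average US household has about 2.43 people. This doesn't mean any household has exactly 2.43 people; it's the average across all households. Because the "5+" category is counted as exactly 5 in the calculation, 2.43 slightly understates the true mean.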
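The same two steps (multiply, then sum) can be written as a short Python function. This is a minimal sketch; the name expected_value is an illustrative choice, and the "5+" category is treated as exactly 5, as in the worked example.

```python
# Household-size distribution (the "5+" category is treated as exactly 5).
household = {1: 0.28, 2: 0.34, 3: 0.15, 4: 0.13, 5: 0.10}

def expected_value(dist):
    """E(X): multiply each value by its probability, then add the products."""
    return sum(x * p for x, p in dist.items())

mean = expected_value(household)
print(round(mean, 2))  # 2.43
```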

4. Variance and Standard Deviation

Variance, Var(X): Measures the probability-weighted average of the squared distance between each value and the expected value. It shows how spread out the distribution is.

Formulas for Variance and Standard Deviation

Var(X) = Σ [(x − E(X))² · P(X = x)]

Alternative form: Var(X) = E(X²) − [E(X)]²
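For readers who want to see why the two forms agree, here is a short derivation. It writes μ for E(X) (a shorthand introduced only for this derivation) and uses the facts that Σ x · P(X = x) = μ and Σ P(X = x) = 1.

```latex
\begin{aligned}
\operatorname{Var}(X)
  &= \sum_x (x - \mu)^2 \, P(X = x) \\
  &= \sum_x x^2 \, P(X = x) \;-\; 2\mu \sum_x x \, P(X = x) \;+\; \mu^2 \sum_x P(X = x) \\
  &= E(X^2) - 2\mu^2 + \mu^2 \\
  &= E(X^2) - [E(X)]^2
\end{aligned}
```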

SD(X) = √[Var(X)]

Standard Deviation: The square root of variance (easier to interpret!)

KEY DIFFERENCE: Variance is in squared units; Standard Deviation is in the original units.
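As a sketch of how these formulas work in practice, the code below computes the variance of the household-size distribution both ways and then takes the square root. The function names are illustrative, and the "5+" category is treated as exactly 5, so the results are approximations for real households.

```python
import math

# Household-size distribution (the "5+" category is treated as exactly 5).
household = {1: 0.28, 2: 0.34, 3: 0.15, 4: 0.13, 5: 0.10}

def expected_value(dist):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Definition form: sum of (x - E(X))^2 * P(X = x)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def variance_shortcut(dist):
    """Alternative form: E(X^2) - [E(X)]^2."""
    mu = expected_value(dist)
    e_x2 = sum(x ** 2 * p for x, p in dist.items())
    return e_x2 - mu ** 2

var = variance(household)
sd = math.sqrt(var)
print(round(var, 4), round(sd, 2))            # 1.6651 1.29
print(round(variance_shortcut(household), 4)) # 1.6651 (same answer, different route)
```

The variance is in "people squared", while the standard deviation is back in people, which is why the SD is the number usually quoted.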

Interpreting Variance and Standard Deviation

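What they tell us: How far the values of X typically fall from the expected value. The larger the variance or standard deviation, the more spread out the distribution.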

Example: For household size, E(X) = 2.43 and SD(X) ≈ 1.29 (from Var(X) ≈ 1.67, treating "5+" as 5):
• Household sizes typically vary from the average by about 1.29 people
• The distribution is somewhat spread out (not tightly clustered)

5. The Binomial Distribution

Binomial Distribution: Describes the number of successes in a fixed number of independent trials, where each trial has two outcomes (success or failure).

Four Requirements for a Binomial Experiment

  1. Fixed number of trials (n): The number of trials is predetermined
  2. Two outcomes per trial: Each trial results in either "success" or "failure"
  3. Constant probability (p): The probability of success is the SAME for each trial
  4. Independent trials: The outcome of one trial doesn't affect another

Binomial Parameters

n = number of trials
p = probability of success on each trial

Expected Value and Variance of Binomial

E(X) = n × p

Expected number of successes in n trials

Var(X) = n × p × (1 − p)

Also written as: Var(X) = n × p × q, where q = (1 − p)
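One way to convince yourself of these shortcuts is to build a binomial distribution value by value and apply the general formulas for E(X) and Var(X). The sketch below relies on the standard binomial probability formula P(X = k) = C(n, k) · p^k · (1 − p)^(n − k), which this guide does not state; the values n = 12 and p = 0.25 are arbitrary illustrative choices.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_distribution(n, p):
    """Full distribution as a dict mapping each k in 0..n to P(X = k)."""
    return {k: binomial_pmf(k, n, p) for k in range(n + 1)}

n, p = 12, 0.25  # hypothetical values chosen only for illustration
dist = binomial_distribution(n, p)

# General discrete-distribution formulas applied to the binomial.
mean = sum(k * prob for k, prob in dist.items())
var = sum((k - mean) ** 2 * prob for k, prob in dist.items())

print(round(mean, 6), n * p)            # 3.0 3.0
print(round(var, 6), n * p * (1 - p))   # 2.25 2.25
```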

Binomial Distribution Examples

Example 1: Coin Flipping

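Flip a fair coin 10 times. Let X = number of heads. Here n = 10 and p = 0.5.

E(X) = 10 × 0.5 = 5
Var(X) = 10 × 0.5 × 0.5 = 2.5
SD(X) = √2.5 ≈ 1.58

Interpretation: In 10 flips, expect about 5 heads, with a standard deviation of about 1.58 heads.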
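The "long-run average" idea can be illustrated by simulating the coin-flip experiment many times. This is a rough sketch: the helper count_heads, the seed 0, and the 20,000 repetitions are arbitrary choices, and simulated results will only be close to the theoretical values, not equal to them.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed (arbitrary) so reruns give the same numbers

def count_heads(n_flips, p_heads, rng):
    """Simulate n_flips independent flips and count how many come up heads."""
    return sum(rng.random() < p_heads for _ in range(n_flips))

# Repeat the 10-flip experiment many times and record the number of heads each time.
samples = [count_heads(10, 0.5, rng) for _ in range(20_000)]

sim_mean = statistics.fmean(samples)
sim_sd = statistics.pstdev(samples)
print(round(sim_mean, 2), round(sim_sd, 2))  # should land near 5 and 1.58
```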

Example 2: Quality Control

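A manufacturer produces items with a 2% defect rate. Inspect 50 items. Let X = number of defects. Here n = 50 and p = 0.02.

E(X) = 50 × 0.02 = 1
Var(X) = 50 × 0.02 × 0.98 = 0.98
SD(X) = √0.98 ≈ 0.99

Interpretation: In 50 items, expect about 1 defect, with a standard deviation of about 1 defect.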
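Both examples use the same three formulas, so they can be wrapped in one small helper. The function name binomial_summary is an illustrative choice, not something defined in this guide.

```python
import math

def binomial_summary(n, p):
    """Return (E(X), Var(X), SD(X)) for a binomial with n trials and success probability p."""
    mean = n * p
    var = n * p * (1 - p)
    return mean, var, math.sqrt(var)

coin = binomial_summary(10, 0.5)      # Example 1: coin flipping
defects = binomial_summary(50, 0.02)  # Example 2: quality control

print([round(v, 2) for v in coin])     # [5.0, 2.5, 1.58]
print([round(v, 2) for v in defects])  # [1.0, 0.98, 0.99]
```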

Shape of Binomial Distribution

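The shape depends on both n and p:
• p = 0.5: the distribution is symmetric
• p < 0.5: the distribution is skewed right
• p > 0.5: the distribution is skewed left
• For a fixed p (not 0 or 1), the distribution becomes more symmetric and bell-shaped as n increases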

Normal Approximation to Binomial

When is the binomial approximately normal?

The binomial distribution is approximately normal when BOTH conditions are met:

np ≥ 10 AND n(1 − p) ≥ 10

When these conditions are satisfied, we can use normal distribution calculations (z-scores) to approximate binomial probabilities.

Example: With n = 50, p = 0.3:
• np = 50 × 0.3 = 15 ≥ 10
• n(1-p) = 50 × 0.7 = 35 ≥ 10
→ Normal approximation is appropriate!
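The condition check, and a comparison of an exact binomial probability with its normal approximation, can be sketched as follows. Several pieces here are assumptions beyond this guide: the function names, the exact binomial formula C(n, k) · p^k · (1 − p)^(n − k), the choice to look at P(X ≤ 18), and the use of 18.5 instead of 18 (a common refinement called a continuity correction).

```python
from math import comb, sqrt
from statistics import NormalDist

def normal_approx_ok(n, p, threshold=10):
    """Check both conditions: np >= threshold and n(1 - p) >= threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for a binomial with n trials and success probability p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

n, p = 50, 0.3
print(normal_approx_ok(n, p))      # True: 15 and 35 are both at least 10
print(normal_approx_ok(50, 0.02))  # False: np is only 1

# Normal curve with the same mean and standard deviation as the binomial.
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

exact = binomial_cdf(18, n, p)
normal = approx.cdf(18.5)  # 18.5 rather than 18: continuity correction (assumption, not from this guide)
print(round(exact, 3), round(normal, 3))  # the two values should be close
```

The quality-control example (n = 50, p = 0.02) fails the check, which fits its strongly right-skewed shape.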

Quick Reference: All Formulas

Expected Value (General Discrete Distribution)

E(X) = Σ [x · P(X = x)]

Variance and Standard Deviation

Var(X) = Σ [(x − E(X))² · P(X = x)]

SD(X) = √[Var(X)]

Binomial Distribution

E(X) = n × p

Var(X) = n × p × (1 − p)

SD(X) = √[n × p × (1 − p)]

Requirements: Fixed n, two outcomes, constant p, independent trials

Normal Approximation Condition

Use normal approximation when: np ≥ 10 AND n(1 − p) ≥ 10

Key Concepts to Remember

1. A discrete random variable takes countable distinct values
2. A probability distribution must have probabilities summing to 1.0
3. Expected value is the long-run average (mean)
4. Variance/SD measure spread; SD is easier to interpret
5. Binomial requires fixed n, two outcomes, constant p, independent trials
6. E(X) and Var(X) have simple formulas for binomial: np and np(1-p)
7. Check np ≥ 10 and n(1-p) ≥ 10 before using normal approximation


Free Statistics Learning Platform • Safaa Dabagh • sdabagh.github.io

© 2025 • Part of UCLA Dissertation Research