Module 4 Study Guide
Discrete Probability Distributions
Free Statistics Learning Platform • Safaa Dabagh
1. Random Variables
Types of Random Variables
Discrete Random Variables
Definition: Takes on a countable set of distinct values
- Examples: Number of heads in coin flips, number of students in a class, number of defective items
- Can list all possible values: 0, 1, 2, 3, ...
- Gaps between values (no values between 2 and 3)
Continuous Random Variables
Definition: Takes on any value in an interval (infinite possibilities)
- Examples: Height, weight, temperature, time
- Can't list all possible values
- Infinitely many values between any two numbers
2. Probability Distributions
Requirements for a Valid Probability Distribution
- Each probability is between 0 and 1: 0 ≤ P(X = x) ≤ 1
- The sum of all probabilities equals exactly 1: Σ P(X = x) = 1
Example: US Household Size Distribution
| Household Size (X) | Probability P(X) |
|---|---|
| 1 person | 0.28 |
| 2 people | 0.34 |
| 3 people | 0.15 |
| 4 people | 0.13 |
| 5+ people | 0.10 |
| Total | 1.00 |
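A quick way to confirm that a table like this is a valid distribution is to check it in code. The sketch below is a minimal illustration, not part of the original guide: the function name `is_valid_distribution` and the tolerance `tol` are assumptions chosen for the example. Besides the sum, it also checks that each probability lies between 0 and 1, which is the standard companion condition.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Compare with a small tolerance because floating-point sums are inexact
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Household-size probabilities from the table above
household_probs = [0.28, 0.34, 0.15, 0.13, 0.10]
print(is_valid_distribution(household_probs))  # True
```

The tolerance matters: decimal probabilities such as 0.1 are not stored exactly, so an exact `== 1.0` comparison can fail even for a correct table.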
Visualization of Probability Distributions
A probability histogram shows:
- X-axis: Values of the random variable
- Y-axis: Probabilities (heights of bars)
- Bar height = Probability of that value
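To see what such a histogram looks like without a plotting library, the sketch below prints a sideways text version for the household-size table. It is only an illustration: the helper name `text_histogram` and the choice of one `#` per percentage point are assumptions made for this example.

```python
def text_histogram(labels, probs, scale=100):
    """Return one line per value; bar length is proportional to probability."""
    lines = []
    for label, p in zip(labels, probs):
        bar = "#" * round(p * scale)  # scale=100 gives one '#' per percentage point
        lines.append(f"{label:>3} | {bar} {p:.2f}")
    return lines

sizes = ["1", "2", "3", "4", "5+"]
probs = [0.28, 0.34, 0.15, 0.13, 0.10]
for line in text_histogram(sizes, probs):
    print(line)
```

The longest bar belongs to the 2-person household, the most likely value in the table.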
3. Expected Value (Mean)
Formula for Expected Value
E(X) = Σ [x · P(X = x)]
Interpretation: Multiply each value by its probability, then sum all products.
Fully Worked Example: US Household Size
Step 1: Multiply each value by its probability
| Size (x) | P(X = x) | x · P(X = x) |
|---|---|---|
| 1 | 0.28 | 1 × 0.28 = 0.28 |
| 2 | 0.34 | 2 × 0.34 = 0.68 |
| 3 | 0.15 | 3 × 0.15 = 0.45 |
| 4 | 0.13 | 4 × 0.13 = 0.52 |
| 5+ | 0.10 | 5 × 0.10 = 0.50 |
Step 2: Sum all the products
E(X) = 0.28 + 0.68 + 0.45 + 0.52 + 0.50 = 2.43
Interpretation: The average US household has about 2.43 people. No household has exactly 2.43 people; the value is the average across all households. Because the "5+" category is counted as exactly 5, this figure slightly understates the true mean.
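The two steps of the worked example can be sketched in a few lines of Python. The function name `expected_value` is an assumption for illustration, and the "5+" category is treated as exactly 5, as in the table.

```python
def expected_value(values, probs):
    """E(X) = sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in zip(values, probs))

# Household sizes; "5+" is treated as exactly 5, matching the table
sizes = [1, 2, 3, 4, 5]
probs = [0.28, 0.34, 0.15, 0.13, 0.10]

mean = expected_value(sizes, probs)
print(round(mean, 2))  # 2.43
```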
4. Variance and Standard Deviation
Formulas for Variance and Standard Deviation
Var(X) = Σ [(x − E(X))² · P(X = x)]
Alternative form: Var(X) = E(X²) − [E(X)]²
SD(X) = √[Var(X)]
Standard Deviation: The square root of the variance. It is easier to interpret because it is in the same units as X.
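As a check that the two variance formulas agree, the sketch below applies both to the household-size distribution from Section 3 (with "5+" treated as 5). The function names are assumptions chosen for this illustration.

```python
import math

def variance_definition(values, probs):
    """Var(X) = sum of (x - E(X))^2 * P(X = x)."""
    mean = sum(x * p for x, p in zip(values, probs))
    return sum((x - mean) ** 2 * p for x, p in zip(values, probs))

def variance_shortcut(values, probs):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mean = sum(x * p for x, p in zip(values, probs))
    mean_of_squares = sum(x ** 2 * p for x, p in zip(values, probs))
    return mean_of_squares - mean ** 2

sizes = [1, 2, 3, 4, 5]
probs = [0.28, 0.34, 0.15, 0.13, 0.10]

var_def = variance_definition(sizes, probs)
var_short = variance_shortcut(sizes, probs)
sd = math.sqrt(var_def)
print(round(var_def, 4), round(var_short, 4), round(sd, 2))  # 1.6651 1.6651 1.29
```

Both forms give a variance of about 1.67 and a standard deviation of about 1.29 people. The shortcut form usually needs less arithmetic by hand, since it avoids subtracting the mean from every value.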
Interpreting Variance and Standard Deviation
What they tell us:
- Low variance/SD: Values are close to the expected value (clustered)
- High variance/SD: Values are spread far from the expected value (scattered)
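Example (US household size, treating "5+" as 5): Var(X) = 7.57 − (2.43)² ≈ 1.67, so SD(X) = √1.67 ≈ 1.29
- Household sizes typically vary from the average by about 1.29 people
- The distribution is somewhat spread out (not tightly clustered)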
5. The Binomial Distribution
Four Requirements for a Binomial Experiment
- Fixed number of trials (n): The number of trials is predetermined
- Two outcomes per trial: Each trial results in either "success" or "failure"
- Constant probability (p): The probability of success is the SAME for each trial
- Independent trials: The outcome of one trial doesn't affect another
Binomial Parameters
- n: Number of trials
- p: Probability of success on each trial (between 0 and 1)
- X: Number of successes (can be 0, 1, 2, ..., n)
Expected Value and Variance of Binomial
E(X) = n × p
Expected number of successes in n trials
Var(X) = n × p × (1 − p)
Also written as: Var(X) = n × p × q, where q = (1 − p)
Binomial Distribution Examples
Flip a fair coin 10 times. Let X = number of heads.
- n = 10 (10 trials)
- p = 0.5 (probability of heads on each flip)
- E(X) = 10 × 0.5 = 5 (expect 5 heads on average)
- Var(X) = 10 × 0.5 × 0.5 = 2.5
- SD(X) = √2.5 ≈ 1.58
Interpretation: In 10 flips, expect about 5 heads, give or take about 1.58 heads (one standard deviation).
A manufacturer produces items with a 2% defect rate. Inspect 50 items. Let X = number of defects.
- n = 50 (50 items)
- p = 0.02 (2% defect rate)
- E(X) = 50 × 0.02 = 1 (expect 1 defective item)
- Var(X) = 50 × 0.02 × 0.98 = 0.98
- SD(X) = √0.98 ≈ 0.99
Interpretation: In 50 items, expect about 1 defect, give or take about 1 defect (one standard deviation).
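Both examples use the same three formulas, so they can be wrapped in one small helper. This is a minimal sketch; the name `binomial_summary` is an assumption for illustration.

```python
import math

def binomial_summary(n, p):
    """Return (E(X), Var(X), SD(X)) for a binomial with n trials and success probability p."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, math.sqrt(variance)

# Coin example: n = 10, p = 0.5
coin_mean, coin_var, coin_sd = binomial_summary(10, 0.5)
print(coin_mean, coin_var, round(coin_sd, 2))  # 5.0 2.5 1.58

# Defect example: n = 50, p = 0.02
defect_mean, defect_var, defect_sd = binomial_summary(50, 0.02)
print(round(defect_mean, 2), round(defect_var, 2), round(defect_sd, 2))  # 1.0 0.98 0.99
```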
Shape of Binomial Distribution
The shape depends on both n and p:
- p = 0.5: Symmetric distribution (roughly bell-shaped unless n is very small)
- p < 0.5: Right-skewed (tail extends to right)
- p > 0.5: Left-skewed (tail extends to left)
- Large n: Distribution becomes more normal-shaped
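The skew directions above can be checked numerically. The sketch below goes slightly beyond this guide, so treat it as an assumption-laden illustration: it uses the standard binomial probability formula P(X = k) = C(n, k) · p^k · (1 − p)^(n − k), which the guide does not state, and it uses the sign of the third central moment as a rough indicator of skew (positive for right-skewed, negative for left-skewed, zero for symmetric). The function names are hypothetical.

```python
import math

def binomial_pmf(n, p):
    """P(X = k) for k = 0..n, using the standard binomial formula."""
    return [math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

def third_central_moment(n, p):
    """Sum of (k - E(X))^3 * P(X = k); its sign indicates the direction of skew."""
    mean = n * p
    return sum((k - mean) ** 3 * prob for k, prob in enumerate(binomial_pmf(n, p)))

for p in (0.2, 0.5, 0.8):
    moment = third_central_moment(10, p)
    if abs(moment) < 1e-9:
        shape = "symmetric"
    elif moment > 0:
        shape = "right-skewed"
    else:
        shape = "left-skewed"
    print(f"p = {p}: {shape}")
```

Note that `math.comb` requires Python 3.8 or later.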
Normal Approximation to Binomial
The binomial distribution is approximately normal when BOTH conditions are met:
np ≥ 10 AND n(1 − p) ≥ 10
When these conditions are satisfied, we can use normal distribution calculations (z-scores) to approximate binomial probabilities.
Example: n = 50, p = 0.3
- np = 50 × 0.3 = 15 ≥ 10
- n(1 − p) = 50 × 0.7 = 35 ≥ 10
→ Normal approximation is appropriate!
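The two-part condition is easy to turn into a reusable check. In the sketch below, the function name `normal_approx_ok` and the `threshold` parameter are assumptions for illustration; the default of 10 is the cutoff used in this guide.

```python
def normal_approx_ok(n, p, threshold=10):
    """True when both np and n(1 - p) are at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(normal_approx_ok(50, 0.3))   # True:  np = 15, n(1 - p) = 35
print(normal_approx_ok(50, 0.02))  # False: np = 1 is below 10
```

The second call uses the defect-rate example from earlier in this section: with only about 1 expected defect, the distribution is too skewed for the normal approximation.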
Quick Reference: All Formulas
Expected Value (General Discrete Distribution)
E(X) = Σ [x · P(X = x)]
Variance and Standard Deviation
Var(X) = Σ [(x − E(X))² · P(X = x)]
SD(X) = √[Var(X)]
Binomial Distribution
E(X) = n × p
Var(X) = n × p × (1 − p)
SD(X) = √[n × p × (1 − p)]
Requirements: Fixed n, two outcomes, constant p, independent trials
Normal Approximation Condition
Use normal approximation when: np ≥ 10 AND n(1 − p) ≥ 10
Key Concepts to Remember
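1. Random variables are either discrete (countable values) or continuous (any value in an interval)
2. A probability distribution must have probabilities summing to 1.0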
3. Expected value is the long-run average (mean)
4. Variance/SD measure spread; SD is easier to interpret
5. Binomial requires fixed n, two outcomes, constant p, and independent trials
6. E(X) and Var(X) have simple formulas for binomial: np and np(1-p)
7. Check np ≥ 10 and n(1-p) ≥ 10 before using normal approximation
Free Statistics Learning Platform • Safaa Dabagh • sdabagh.github.io
© 2025 • Part of UCLA Dissertation Research