Type I and Type II Errors & Power
Understand the two types of errors in hypothesis testing and the concept of statistical power
Lesson Objectives
By the end of this lesson, you will be able to:
- Define Type I and Type II errors
- Explain the relationship between α and β
- Understand the concept of statistical power
- Identify factors that affect power
- Recognize real-world consequences of different types of errors
1. The Four Possible Outcomes
When we conduct a hypothesis test, we make a decision: either reject H₀ or fail to reject H₀. But we don't know the true state of reality—H₀ might actually be true or false. This creates four possible outcomes:
| Our Decision | H₀ is Actually True | H₀ is Actually False |
|---|---|---|
| Reject H₀ | Type I Error (False Positive), probability = α | Correct Decision (True Positive), probability = 1 − β (Power) |
| Fail to Reject H₀ | Correct Decision (True Negative), probability = 1 − α | Type II Error (False Negative), probability = β |
Two of these outcomes are correct decisions, and two are errors.
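The four outcomes can be made concrete with a small simulation. The sketch below (illustrative numbers, not from the lesson) repeats a one-sample z-test many times, with H₀ (μ = 0) true in half the runs and false (μ = 0.5) in the other half, and tallies each run into one of the four cells:

```python
import random
from statistics import NormalDist, mean

random.seed(0)
ALPHA, N, SIGMA = 0.05, 30, 1.0
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided critical value

counts = {"type_I": 0, "true_neg": 0, "type_II": 0, "true_pos": 0}
for trial in range(4000):
    h0_true = trial % 2 == 0
    mu = 0.0 if h0_true else 0.5               # H0 true half the time
    sample = [random.gauss(mu, SIGMA) for _ in range(N)]
    z = mean(sample) / (SIGMA / N ** 0.5)      # test statistic under H0: mu = 0
    reject = abs(z) > z_crit
    if h0_true:
        counts["type_I" if reject else "true_neg"] += 1
    else:
        counts["true_pos" if reject else "type_II"] += 1

print(counts)  # Type I rate among H0-true runs should land near alpha = 0.05
```

Because α is set by the analyst, the simulated Type I rate hovers near 0.05; the Type II rate depends on the effect size, sample size, and variability.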
2. Type I Error (α)
Definition: Type I Error
A Type I error occurs when we reject a true null hypothesis. This is also called a false positive.
The probability of making a Type I error is denoted by α (alpha), which is the significance level we choose.
Example 1: Type I Error in Medicine
Medical test scenario:
- H₀: Patient does not have the disease
- Hₐ: Patient has the disease
Type I Error: The test says the patient HAS the disease (reject H₀), but the patient is actually healthy (H₀ was true).
Consequence: Unnecessary treatment, anxiety, additional testing, and medical costs for a healthy person.
Example 2: Type I Error in Criminal Justice
- H₀: Defendant is innocent
- Hₐ: Defendant is guilty
Type I Error: Convict an innocent person (reject H₀ when it's true).
Consequence: An innocent person goes to jail—a very serious error!
This is why the criminal justice system uses a very low α ("beyond reasonable doubt").
We directly control the Type I error rate by choosing α. Common choices:
- α = 0.05: Accept 5% chance of Type I error (standard in most research)
- α = 0.01: Accept only 1% chance (more conservative, fewer false positives)
- α = 0.10: Accept 10% chance (less conservative, more false positives)
3. Type II Error (β)
Definition: Type II Error
A Type II error occurs when we fail to reject a false null hypothesis. This is also called a false negative.
The probability of making a Type II error is denoted by β (beta).
Example 3: Type II Error in Medicine
Medical test scenario:
- H₀: Patient does not have the disease
- Hₐ: Patient has the disease
Type II Error: The test says the patient is healthy (fail to reject H₀), but the patient actually HAS the disease (H₀ was false).
Consequence: Disease goes untreated, potentially leading to serious health complications or death.
Example 4: Type II Error in Drug Development
- H₀: New drug is no better than existing treatment
- Hₐ: New drug is better than existing treatment
Type II Error: Conclude the new drug doesn't work (fail to reject H₀), when it actually is effective (H₀ is false).
Consequence: An effective treatment is rejected and never becomes available to patients.
4. The Relationship Between α and β
For a fixed sample size and effect size, Type I and Type II errors are inversely related—making one less likely makes the other more likely:
- If you decrease α (reduce Type I error risk), you increase β (raise Type II error risk)
- If you increase α (accept more Type I error risk), you decrease β (reduce Type II error risk)
Think of it like adjusting the sensitivity of a test:
- Very strict test (low α): Fewer false alarms, but might miss real effects
- Lenient test (high α): Catches more real effects, but also more false alarms
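The tradeoff can be computed directly for a one-sided z-test (H₀: μ = 0 vs Hₐ: μ > 0). The true mean, n, and σ below are made-up illustrative values:

```python
from statistics import NormalDist

Z = NormalDist()
mu_true, n, sigma = 0.4, 25, 1.0
shift = mu_true * n ** 0.5 / sigma  # distance of the true mean from H0, in SE units

betas = {}
for alpha in (0.01, 0.05, 0.10):
    z_crit = Z.inv_cdf(1 - alpha)   # stricter test -> larger critical value
    betas[alpha] = Z.cdf(z_crit - shift)  # P(fail to reject | H0 false)
    print(f"alpha = {alpha:.2f}  ->  beta = {betas[alpha]:.3f}")
```

As α shrinks, the critical value moves outward and β grows: the strict test misses more real effects.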
Which Error is Worse?
The answer depends on context and consequences:
| Scenario | Worse Error | Strategy |
|---|---|---|
| Criminal trial | Type I (convict innocent) | Use very low α ("beyond reasonable doubt") |
| Cancer screening | Type II (miss cancer) | Accept higher α to catch more cases |
| Quality control (safety) | Type II (miss defect) | Strict testing to avoid missing defects |
| Scientific research | Type I (false discovery) | Standard α = 0.05 to control false claims |
Example 5: Choosing α Based on Consequences
Scenario A: Airport Security
- H₀: Passenger is not a threat
- Type I Error: Flag innocent passenger (inconvenience)
- Type II Error: Miss actual threat (catastrophic)
- Decision: Use higher α to minimize Type II error (better safe than sorry)
Scenario B: Spam Filter
- H₀: Email is legitimate
- Type I Error: Mark legitimate email as spam (might miss important message)
- Type II Error: Let spam through (minor annoyance)
- Decision: Use lower α to minimize Type I error (don't want to lose important emails)
5. Statistical Power (1 - β)
Definition: Power
Statistical power is the probability of correctly rejecting a false null hypothesis. It's calculated as:

Power = 1 − β
Power represents the test's ability to detect a real effect when one exists. Higher power is better!
What does power tell us?
- Power = 0.80 (80%): If there really is an effect, we have an 80% chance of detecting it
- Power = 0.50 (50%): Only a coin flip's chance of detecting a real effect—not good!
- Power = 0.95 (95%): Very high chance of detecting a real effect—excellent!
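A hedged sketch of how power is computed for a two-sided one-sample z-test, using only the normal distribution. Here `effect` is the standardized effect size (μ_true − μ₀)/σ; the specific numbers are illustrative:

```python
from statistics import NormalDist

def z_test_power(effect: float, n: int, alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample z-test with standardized effect size."""
    Z = NormalDist()
    z_crit = Z.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5  # where the statistic centers when H0 is false
    # Reject if the statistic lands beyond either critical value.
    return (1 - Z.cdf(z_crit - shift)) + Z.cdf(-z_crit - shift)

print(round(z_test_power(effect=0.5, n=32), 2))  # ≈ 0.81
```

With a moderate effect (0.5 SD) and 32 observations, this test clears the conventional 0.80 power threshold.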
Factors That Affect Power
Four main factors influence statistical power:
| Factor | Increase This... | Effect on Power |
|---|---|---|
| Sample Size (n) | Larger sample | Power increases |
| Significance Level (α) | Higher α (e.g., 0.10 vs 0.05) | Power increases (but more Type I errors) |
| Effect Size | Larger difference from H₀ | Power increases (easier to detect big effects) |
| Variability (σ) | Lower variability | Power increases (less noise in data) |
We can't always control effect size or population variability, but we can often collect more data. Doubling the sample size substantially increases power.
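A quick sketch of the sample-size effect, holding α = 0.05 and a smallish standardized effect of 0.3 fixed (illustrative values) and repeatedly doubling n for a two-sided z-test:

```python
from statistics import NormalDist

Z = NormalDist()
alpha, effect = 0.05, 0.3
z_crit = Z.inv_cdf(1 - alpha / 2)

powers = {}
for n in (25, 50, 100, 200):
    shift = effect * n ** 0.5   # grows with sqrt(n): more data, clearer signal
    powers[n] = (1 - Z.cdf(z_crit - shift)) + Z.cdf(-z_crit - shift)
    print(f"n = {n:4d}  ->  power = {powers[n]:.2f}")
```

Each doubling of n raises power substantially (here from roughly 0.32 at n = 25 to about 0.85 at n = 100), which is why collecting more data is the standard remedy for an underpowered study.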
Check Your Understanding
Question 1: A pharmaceutical company tests a new drug. What would be a Type I error and a Type II error in this scenario?
H₀: The drug is not effective
Hₐ: The drug is effective
Type I Error (α): Conclude the drug is effective (reject H₀) when it actually doesn't work (H₀ true).
Consequence: Ineffective drug goes to market, patients pay for treatment that doesn't help.
Type II Error (β): Conclude the drug doesn't work (fail to reject H₀) when it actually is effective (H₀ false).
Consequence: Effective treatment never reaches patients who could benefit.
Question 2: A study has α = 0.05 and power = 0.75. What are the probabilities of Type I error, Type II error, and correctly detecting a real effect?
Type I Error (α) = 0.05 (5%)
Type II Error (β) = 1 - Power = 1 - 0.75 = 0.25 (25%)
Correctly detecting real effect (Power) = 0.75 (75%)
This means if the null hypothesis is false (there really is an effect), we have a 75% chance of correctly rejecting it, and a 25% chance of missing it.
Question 3: A researcher wants to increase the power of their study from 0.70 to 0.85. What are three ways they could do this?
Three ways to increase power:
- Increase sample size: Collect data from more participants (most common approach)
- Increase α: Use α = 0.10 instead of 0.05 (though this increases Type I error risk)
- Reduce variability: Use more precise measurement tools or more homogeneous sample
Best approach: Usually increasing sample size, as it doesn't involve the tradeoffs of the other methods.
Summary
- Type I Error (α): Rejecting a true null hypothesis (false positive)
- Type II Error (β): Failing to reject a false null hypothesis (false negative)
- α and β are inversely related—decreasing one increases the other
- Power = 1 - β: The probability of correctly detecting a real effect
- A common convention is to aim for power ≥ 0.80 (at least an 80% chance of detecting a real effect)
- The most practical way to increase power is to increase sample size
- Which error is worse depends on the real-world consequences of each type