Learn Without Walls

Type I and Type II Errors & Power

Understand the two types of errors in hypothesis testing and the concept of statistical power

Lesson Objectives

By the end of this lesson, you will be able to:

  • Identify the four possible outcomes of a hypothesis test
  • Define Type I error (α) and Type II error (β) and give real-world examples of each
  • Explain the tradeoff between α and β
  • Define statistical power (1 - β) and describe the factors that affect it

1. The Four Possible Outcomes

When we conduct a hypothesis test, we make a decision: either reject H₀ or fail to reject H₀. But we don't know the true state of reality—H₀ might actually be true or false. This creates four possible outcomes:

Our decision vs. reality (unknown to us):

Reject H₀:
  • H₀ is actually true → Type I Error (False Positive), probability = α
  • H₀ is actually false → Correct Decision (True Positive), probability = 1 - β (Power)

Fail to Reject H₀:
  • H₀ is actually true → Correct Decision (True Negative), probability = 1 - α
  • H₀ is actually false → Type II Error (False Negative), probability = β

Two of these outcomes are correct decisions, and two are errors.

2. Type I Error (α)

Definition: Type I Error

A Type I error occurs when we reject a true null hypothesis. This is also called a false positive.

The probability of making a Type I error is denoted by α (alpha), which is the significance level we choose.

Example 1: Type I Error in Medicine

Medical test scenario:

  • H₀: Patient does not have the disease
  • Hₐ: Patient has the disease

Type I Error: The test says the patient HAS the disease (reject H₀), but the patient is actually healthy (H₀ was true).

Consequence: Unnecessary treatment, anxiety, additional testing, and medical costs for a healthy person.

Example 2: Type I Error in Criminal Justice

  • H₀: Defendant is innocent
  • Hₐ: Defendant is guilty

Type I Error: Convict an innocent person (reject H₀ when it's true).

Consequence: An innocent person goes to jail—a very serious error!

This is why the criminal justice system uses a very low α ("beyond reasonable doubt").

Controlling Type I Error:

We directly control the Type I error rate by choosing α. Common choices:

  • α = 0.05: Accept 5% chance of Type I error (standard in most research)
  • α = 0.01: Accept only 1% chance (more conservative, fewer false positives)
  • α = 0.10: Accept 10% chance (less conservative, more false positives)
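You can see α at work with a quick simulation. The sketch below repeatedly tests a true null hypothesis with a two-sided z-test; the specific numbers (μ₀ = 100, σ = 15, n = 30) are illustrative choices, not from the lesson. With α = 0.05, roughly 5% of the tests should reject by chance alone:

```python
import math
import random
from statistics import NormalDist

def z_test_p_value(sample, mu0, sigma):
    """Two-sided z-test p-value, assuming the population sigma is known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

random.seed(1)
alpha = 0.05
trials = 10_000
# H0 is TRUE in every trial: data are drawn from the hypothesized mu = 100,
# so every rejection is a Type I error (a false positive).
rejections = sum(
    z_test_p_value([random.gauss(100, 15) for _ in range(30)], 100, 15) < alpha
    for _ in range(trials)
)
print(f"Observed Type I error rate: {rejections / trials:.3f}")
```

The observed rejection rate lands close to 0.05, confirming that α is exactly the false-positive rate we chose to accept.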

3. Type II Error (β)

Definition: Type II Error

A Type II error occurs when we fail to reject a false null hypothesis. This is also called a false negative.

The probability of making a Type II error is denoted by β (beta).

Example 3: Type II Error in Medicine

Medical test scenario:

  • H₀: Patient does not have the disease
  • Hₐ: Patient has the disease

Type II Error: The test says the patient is healthy (fail to reject H₀), but the patient actually HAS the disease (H₀ was false).

Consequence: Disease goes untreated, potentially leading to serious health complications or death.

Example 4: Type II Error in Drug Development

  • H₀: New drug is no better than existing treatment
  • Hₐ: New drug is better than existing treatment

Type II Error: Conclude the new drug doesn't work (fail to reject H₀), when it actually is effective (H₀ is false).

Consequence: An effective treatment is rejected and never becomes available to patients.

Important: Unlike α, we don't directly choose β. The value of β depends on several factors including sample size, effect size, and α. However, we can reduce β by increasing sample size or using better research designs.
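A simulation also makes β concrete. In the sketch below the null hypothesis is false (the true mean is 108, but we test H₀: μ = 100), so every failure to reject is a Type II error; all of the specific numbers are illustrative assumptions, not from the lesson:

```python
import math
import random
from statistics import NormalDist

def z_test_p_value(sample, mu0, sigma):
    """Two-sided z-test p-value, assuming the population sigma is known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

random.seed(2)
alpha, mu0, mu_true, sigma, n = 0.05, 100, 108, 15, 20
trials = 10_000
# H0 is FALSE in every trial: data come from mu = 108, but we test mu = 100,
# so every non-rejection is a Type II error (a false negative).
misses = sum(
    z_test_p_value([random.gauss(mu_true, sigma) for _ in range(n)], mu0, sigma) >= alpha
    for _ in range(trials)
)
beta_hat = misses / trials
print(f"Estimated beta: {beta_hat:.3f}, power: {1 - beta_hat:.3f}")
```

Notice that we never chose β directly; it emerged from the sample size, the effect size (108 vs. 100), the variability (σ = 15), and α.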

4. The Relationship Between α and β

Type I and Type II errors are inversely related:

The Tradeoff:
  • If you decrease α (reduce Type I error), you increase β (increase Type II error)
  • If you increase α (increase Type I error), you decrease β (reduce Type II error)

Think of it like adjusting the sensitivity of a test:

  • Very strict test (low α): Fewer false alarms, but might miss real effects
  • Lenient test (high α): Catches more real effects, but also more false alarms
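The tradeoff can be put in numbers. The sketch below uses the standard approximate power formula for a two-sided z-test with known σ, holding the study fixed and varying only α; the study parameters (n = 50, true effect = 5, σ = 15) are illustrative assumptions, not from the lesson:

```python
import math
from statistics import NormalDist

def power_two_sided_z(n, effect, sigma, alpha):
    """Approximate power of a two-sided z-test with known sigma.

    Uses power ≈ Φ(effect·√n/σ − z_crit); the tiny probability of
    rejecting in the wrong tail is ignored.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(effect * math.sqrt(n) / sigma - z_crit)

# Same study each time (n = 50, effect = 5, sigma = 15); only alpha changes.
for alpha in (0.01, 0.05, 0.10):
    beta = 1 - power_two_sided_z(50, 5, 15, alpha)
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

Running this shows β shrinking as α grows: tightening α to 0.01 makes the test stricter but substantially raises the chance of missing the real effect.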

Which Error is Worse?

The answer depends on context and consequences:

  • Criminal trial — worse error: Type I (convict innocent); strategy: use a very low α ("beyond reasonable doubt")
  • Cancer screening — worse error: Type II (miss cancer); strategy: accept a higher α to catch more cases
  • Quality control (safety) — worse error: Type II (miss defect); strategy: strict testing to avoid missing defects
  • Scientific research — worse error: Type I (false discovery); strategy: standard α = 0.05 to control false claims

Example 5: Choosing α Based on Consequences

Scenario A: Airport Security

  • H₀: Passenger is not a threat
  • Type I Error: Flag innocent passenger (inconvenience)
  • Type II Error: Miss actual threat (catastrophic)
  • Decision: Use higher α to minimize Type II error (better safe than sorry)

Scenario B: Spam Filter

  • H₀: Email is legitimate
  • Type I Error: Mark legitimate email as spam (might miss important message)
  • Type II Error: Let spam through (minor annoyance)
  • Decision: Use lower α to minimize Type I error (don't want to lose important emails)

5. Statistical Power (1 - β)

Definition: Power

Statistical power is the probability of correctly rejecting a false null hypothesis. It's calculated as:

Power = 1 - β

Power represents the test's ability to detect a real effect when one exists. Higher power is better!

What does power tell us? If there really is an effect, power is the probability that our test will detect it. A study with power = 0.80 will find a true effect 80% of the time; the other 20% of the time it will miss it and commit a Type II error.

Recommended Power: Most researchers aim for power ≥ 0.80 (80% or higher). This means accepting up to a 20% chance of a Type II error (β = 0.20).

Factors That Affect Power

Four main factors influence statistical power:

  • Sample Size (n): larger sample → power increases
  • Significance Level (α): higher α (e.g., 0.10 vs. 0.05) → power increases (but more Type I errors)
  • Effect Size: larger difference from H₀ → power increases (easier to detect big effects)
  • Variability (σ): lower variability → power increases (less noise in the data)

Most Practical Way to Increase Power: Increase sample size!

We can't always control effect size or population variability, but we can often collect more data. Doubling the sample size substantially increases power.
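The effect of sample size can be checked with the approximate power formula for a two-sided z-test with known σ; the study parameters below (effect = 5, σ = 15, α = 0.05) are illustrative assumptions, not from the lesson:

```python
import math
from statistics import NormalDist

def power_two_sided_z(n, effect, sigma, alpha):
    """Approximate power of a two-sided z-test with known sigma.

    Uses power ≈ Φ(effect·√n/σ − z_crit); the tiny probability of
    rejecting in the wrong tail is ignored.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(effect * math.sqrt(n) / sigma - z_crit)

# Same effect, sigma, and alpha throughout; only n changes.
for n in (50, 100, 200):
    print(f"n = {n:3d} -> power = {power_two_sided_z(n, 5, 15, 0.05):.3f}")
```

With these numbers, doubling n from 50 to 100 lifts power past the 0.80 target, illustrating why collecting more data is usually the first lever researchers reach for.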

Check Your Understanding

Question 1: A pharmaceutical company tests a new drug. What would be a Type I error and a Type II error in this scenario?

H₀: The drug is not effective
Hₐ: The drug is effective

Type I Error (α): Conclude the drug is effective (reject H₀) when it actually doesn't work (H₀ true).

Consequence: Ineffective drug goes to market, patients pay for treatment that doesn't help.

Type II Error (β): Conclude the drug doesn't work (fail to reject H₀) when it actually is effective (H₀ false).

Consequence: Effective treatment never reaches patients who could benefit.

Question 2: A study has α = 0.05 and power = 0.75. What are the probabilities of Type I error, Type II error, and correctly detecting a real effect?

Type I Error (α) = 0.05 (5%)

Type II Error (β) = 1 - Power = 1 - 0.75 = 0.25 (25%)

Correctly detecting real effect (Power) = 0.75 (75%)

This means if the null hypothesis is false (there really is an effect), we have a 75% chance of correctly rejecting it, and a 25% chance of missing it.

Question 3: A researcher wants to increase the power of their study from 0.70 to 0.85. What are three ways they could do this?

Three ways to increase power:

  1. Increase sample size: Collect data from more participants (most common approach)
  2. Increase α: Use α = 0.10 instead of 0.05 (though this increases Type I error risk)
  3. Reduce variability: Use more precise measurement tools or more homogeneous sample

Best approach: Usually increasing sample size, as it doesn't involve the tradeoffs of the other methods.

Summary

  • Type I Error (α): Rejecting a true null hypothesis (false positive)
  • Type II Error (β): Failing to reject a false null hypothesis (false negative)
  • α and β are inversely related—decreasing one increases the other
  • Power = 1 - β: The probability of correctly detecting a real effect
  • We aim for power ≥ 0.80 (at least 80%)
  • The most practical way to increase power is to increase sample size
  • Which error is worse depends on the real-world consequences of each type