Hypothesis Testing
Testing claims with data — the core of inferential statistics
📌 Before You Start
What you need: Modules 3 (CLT, standard error) and 4 (confidence intervals) completed. Understanding of p-values at a basic conceptual level is helpful.
What you’ll learn: The logic of null hypothesis significance testing (H₀ vs. Hₐ). What p-values actually mean. How to run one-sample and two-sample t-tests in R. Type I and Type II errors. Why statistical significance doesn’t equal practical significance.
📖 The Concept: Hypothesis Testing
Hypothesis testing gives us a formal framework for asking: "Is this result surprising enough to be convincing evidence against the default assumption?"
- H₀ (Null Hypothesis): the default claim. "No effect, no difference, status quo." We assume H₀ is true and check whether the data contradict it.
- Hₐ (Alternative Hypothesis): what we’re testing for. "There IS an effect / difference."
- p-value: P(observing results at least this extreme, IF H₀ is true). A small p-value means the result would be surprising under H₀.
- α (significance level): our threshold, usually 0.05. If p < α, we reject H₀; otherwise we fail to reject it.
| | H₀ Is True | H₀ Is False |
|---|---|---|
| Reject H₀ | Type I Error (α): false positive | Correct! |
| Fail to Reject H₀ | Correct! | Type II Error (β): false negative |
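The two error rates in the table can be put into numbers with base R's power.t.test(), which reports power (1 − β) for a t-test. This is only a sketch: the group size, true difference, and standard deviation below are illustrative assumptions, not values from this module.

```r
# Assumed scenario (illustrative only): two groups of 25, a true difference
# of 4 points, and a standard deviation of 10 points in each group.
alpha <- 0.05  # Type I error rate we agree to tolerate

pw <- power.t.test(n = 25, delta = 4, sd = 10, sig.level = alpha,
                   type = "two.sample", alternative = "two.sided")

power_value <- pw$power        # P(reject H0 | H0 is false)
beta        <- 1 - power_value # Type II error rate: P(fail to reject H0 | H0 is false)

cat("Power:", round(power_value, 3), "\n")
cat("Type II error rate (beta):", round(beta, 3), "\n")
```

Raising n in the call shrinks β while α stays fixed at the level you chose.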
🔢 The t-statistic
t = (x̄ − μ₀) / (s / √n)
x̄ = sample mean | μ₀ = hypothesized mean | s = sample SD | n = sample size
In R: t.test(data, mu = μ₀) for a one-sample test; t.test(x, y) for a two-sample test (Welch's test by default)
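To see how the symbols above fit together, the t-statistic can be computed by hand and compared with t.test(). The data vector here is made up for illustration, and the names x, mu0, t_stat, and p_val are just placeholders.

```r
# Hypothetical sample of 10 measurements (invented for this sketch)
x   <- c(72, 68, 75, 71, 69, 74, 73, 70, 76, 72)
mu0 <- 70  # hypothesized mean under H0

n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))  # (sample mean - mu0) / standard error
df     <- n - 1                                # degrees of freedom

# Two-sided p-value: chance of a t at least this far from 0 in either tail
p_val <- 2 * pt(-abs(t_stat), df = df)

# t.test() should reproduce both numbers
res <- t.test(x, mu = mu0)
cat("By hand:  t =", t_stat, " p =", p_val, "\n")
cat("t.test(): t =", unname(res$statistic), " p =", res$p.value, "\n")
```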
💻 In R — Worked Example (read-only)
A one-sample t-test asking whether a population mean differs from 70. R reports the t-statistic, p-value, and confidence interval all at once.
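A minimal sketch of such a test is shown here. The data are simulated, and the sample size, mean, SD, and seed are assumptions chosen for illustration rather than the module's original example.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible
scores <- rnorm(30, mean = 73, sd = 8)  # simulated scores (assumed parameters)

# H0: mu = 70 versus Ha: mu != 70 (two-sided is the default)
result <- t.test(scores, mu = 70)
print(result)  # shows t, df, p-value, and the 95% confidence interval

# Individual pieces can be pulled out of the result object
result$statistic  # t-statistic
result$p.value    # p-value
result$conf.int   # 95% confidence interval for the mean
```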
🖐️ Your Turn
Exercise 1 — One-Sample t-test: Coffee Shop
A coffee shop claims its drinks are 12 oz on average. A consumer group samples 20 drinks and finds mean = 11.6 oz, SD = 0.8 oz. Test H₀: μ = 12 at α = 0.05. Report your conclusion.
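Because only summary statistics are given, t.test() cannot be called directly on raw data. One possible approach, sketched here with placeholder variable names, is to apply the t formula to the summary numbers and compare the p-value to α.

```r
# Summary statistics from the exercise
x_bar <- 11.6   # sample mean (oz)
mu0   <- 12     # claimed mean under H0 (oz)
s     <- 0.8    # sample SD (oz)
n     <- 20     # sample size
alpha <- 0.05

t_stat <- (x_bar - mu0) / (s / sqrt(n))
df     <- n - 1
p_val  <- 2 * pt(-abs(t_stat), df = df)  # two-sided p-value
t_crit <- qt(1 - alpha / 2, df = df)     # critical value for comparison

reject <- p_val < alpha
cat("t =", round(t_stat, 3), " df =", df, " p =", round(p_val, 4), "\n")
cat("Critical |t| =", round(t_crit, 3), "\n")
cat(if (reject) "Reject H0" else "Fail to reject H0", "at alpha =", alpha, "\n")
```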
Exercise 2 — Two-Sample t-test: Teaching Methods
Two teaching methods are compared. Method A (n=25) averages 82 points. Method B (n=25) averages 78 points. Is the difference statistically significant at α = 0.05?
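The exercise gives group means and sizes but no spread, so a starting point is to simulate scores. In this sketch the standard deviation of 10 points is an assumption, and method_a and method_b are placeholder names.

```r
set.seed(55)  # fixes the random draw so the result is reproducible

# Simulated scores: means from the exercise, SD of 10 is an assumption
method_a <- rnorm(25, mean = 82, sd = 10)
method_b <- rnorm(25, mean = 78, sd = 10)

# Two-sample t-test (Welch's version is R's default)
res <- t.test(method_a, method_b)
print(res)

alpha <- 0.05
cat(if (res$p.value < alpha) "Significant" else "Not significant",
    "at alpha =", alpha, "\n")
```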
Change set.seed(55) to a different seed and run several times. Results may flip between significant and not significant. Small samples plus small effects give inconsistent results; that inconsistency is what low statistical power looks like.
Exercise 3 — Type I Error Simulation
Run 200 t-tests where H₀ is TRUE (both groups sampled from the same distribution). Count how many return p < 0.05. Expect about 10 (5% of 200); those are false positives.
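One way to set up this simulation is with replicate(). The group size of 30, the standard normal distribution, and the seed are assumptions; the exercise only fixes the number of tests and the threshold.

```r
set.seed(123)   # arbitrary seed for reproducibility
n_tests <- 200
n_per_group <- 30  # assumed group size

# Each replication draws both groups from the SAME distribution, so H0 is true
p_values <- replicate(n_tests, {
  x <- rnorm(n_per_group, mean = 0, sd = 1)
  y <- rnorm(n_per_group, mean = 0, sd = 1)
  t.test(x, y)$p.value
})

false_positives <- sum(p_values < 0.05)
cat("False positives:", false_positives, "out of", n_tests, "\n")
cat("Observed Type I error rate:", false_positives / n_tests, "\n")
```

The count will vary from seed to seed around the expected 10, since each test has a 5% chance of a false positive.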
🧠 Brain Break
The p-value is one of the most misunderstood concepts in science. It does NOT measure the probability that H₀ is true.
Remember: p-value = P(data at least this extreme | H₀ true). A small p-value means "if the null were true, this result would be surprising." That’s evidence against H₀, not proof.
✅ Key Takeaway
A p-value below 0.05 means the result would be unlikely if H₀ were true. It does not mean that H₀ is definitely false, or that the effect is large or important. Statistical significance ≠ practical significance. Always report effect sizes and confidence intervals alongside p-values.
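The gap between statistical and practical significance can be illustrated with a very large simulated sample and a tiny true effect. All numbers below are assumptions chosen for the demonstration, and the effect size is computed by hand as Cohen's d.

```r
set.seed(7)  # arbitrary seed for reproducibility

# Assumed scenario: true mean is 100.2 versus a hypothesized 100,
# with SD 10, so the true effect is only 0.02 standard deviations.
n <- 1e6
x <- rnorm(n, mean = 100.2, sd = 10)

res <- t.test(x, mu = 100)
cohens_d <- (mean(x) - 100) / sd(x)  # standardized effect size

cat("p-value:", format.pval(res$p.value), "\n")
cat("Cohen's d:", round(cohens_d, 4), "\n")
cat("95% CI for the mean:", round(res$conf.int, 3), "\n")
```

With a sample this large the p-value is tiny even though the effect is negligible, which is why the effect size and confidence interval belong next to the p-value in any report.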
🏆 Module 5 Complete!
You now understand hypothesis testing — the foundation of scientific inference. You can run t-tests in R, interpret p-values correctly, and understand what false positives mean. Next: relationships between variables.