Session Agenda
| Time (min) | Block | What's Happening |
|---|---|---|
| 0–5 | Hook | Show a graph with a truncated Y-axis and ask, "What does this tell us?" Then show the same data with the Y-axis starting at 0. |
| 5–20 | Lesson 1–2 | Cherry-picking · Survivorship bias · Framing effects — each with a real-world example |
| 20–35 | Lesson 3 | Simpson's Paradox — two worked examples with small numbers; think step-by-step |
| 35–52 | Activity | Data Detectives: 4 case studies — identify the deception technique, find the flaw, reframe honestly |
| 52–58 | Debrief | Students write: "Which deception technique was hardest to spot? Why?" |
| 58–60 | Close | Preview S5: "Even honest data has randomness in it — next time we learn about probability." |
Tone note: Frame this session as empowerment, not cynicism. "You now have the tools to catch this" rather than "data is always lying to you." These tricks fool even trained adults — validate the difficulty.
Materials Needed
- Printed Case Study cards (4 cases; see below)
- Worksheets (1 per student)
- Pencils
- Hook graph, printed or projected (truncated vs. full Y-axis)
Tip: Print case study cards double-sided so groups can reference the "flaw explanation" after they've attempted to identify it themselves.
Key Vocabulary
Cherry-picking — selecting only the data that supports your conclusion, ignoring the rest
Survivorship bias — only studying the "survivors" of a process, missing those who didn't make it
Framing effect — presenting the same data differently to create different impressions
Simpson's paradox — a trend appears in separate groups but reverses when groups are combined
Confirmation bias — the tendency to seek data that confirms what you already believe
Simpson's Paradox — Two Worked Examples for Instructors
Example 1: School Improvement (simpler)
| Group | School A pass rate | School B pass rate |
|---|---|---|
| Strong students | 90% (18/20) | 85% (85/100) |
| Struggling students | 30% (30/100) | 20% (4/20) |
| Overall | 40% (48/120) | 74% (89/120) |
School A is better in BOTH groups, yet the headline "School B has a 74% pass rate vs. School A's 40%" is also accurate. The paradox: School A has far more struggling students in its mix (100 of its 120 students, versus 20 of 120 at School B), so its overall rate is pulled toward its 30% struggling-group rate. When you combine groups of very different sizes, the result can flip.
Example 2: Hospital Treatment Success
| Patient type | Treatment A success | Treatment B success |
|---|---|---|
| Mild cases | 93% (81/87) | 87% (234/270) |
| Severe cases | 73% (192/263) | 69% (55/80) |
| Combined | 78% (273/350) | 83% (289/350) |
Treatment A is better for BOTH mild AND severe patients, but overall, Treatment B appears better. Why? Treatment B is used mostly on mild (easier) cases: 270 of its 350 patients, versus only 87 of 350 for Treatment A. The unequal group sizes create a misleading combined total. A hospital administrator relying only on the "83% vs. 78%" would choose the worse treatment!
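Optional instructor check: the short Python sketch below recomputes the hospital example from raw counts, so you can confirm the reversal before teaching it. The counts are an assumption (81/87 and 192/263 for Treatment A, 234/270 and 55/80 for Treatment B), chosen because they reproduce the 78% and 83% combined rates. The function and variable names are illustrative only.

```python
from fractions import Fraction

# (successes, patients) for each treatment and patient type.
# Assumed counts: the ones consistent with the 78% and 83% combined rates.
results = {
    "Treatment A": {"mild": (81, 87), "severe": (192, 263)},
    "Treatment B": {"mild": (234, 270), "severe": (55, 80)},
}

def rate(successes, patients):
    """Exact success rate for one group."""
    return Fraction(successes, patients)

def combined_rate(groups):
    """Success rate after pooling all patient types together."""
    successes = sum(s for s, _ in groups.values())
    patients = sum(n for _, n in groups.values())
    return Fraction(successes, patients)

a, b = results["Treatment A"], results["Treatment B"]

# Treatment A wins inside every patient type...
a_wins_each_group = all(rate(*a[g]) > rate(*b[g]) for g in a)
# ...yet Treatment B wins once the groups are pooled.
b_wins_combined = combined_rate(b) > combined_rate(a)

for name, groups in results.items():
    per_group = ", ".join(f"{g} {float(rate(*c)):.0%}" for g, c in groups.items())
    print(f"{name}: {per_group}, combined {float(combined_rate(groups)):.0%}")
print("A better in each group:", a_wins_each_group)
print("B better when combined:", b_wins_combined)
```

Both checks come out True. Changing the counts so that each treatment sees the same mix of mild and severe patients makes the reversal disappear, which is a useful extension for students who finish early.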
Teaching the paradox: Work through Example 1 on the board step by step. Ask: "Who is better in the strong student group? Who is better in the struggling group? Now look at the combined number — what happened?" Let students sit with the confusion before explaining. The confusion IS the lesson.
Discussion Questions + Teacher Notes
- "Is cherry-picking always intentional?"
→ No — and this is important. Confirmation bias means we often cherry-pick unconsciously. We notice data that agrees with us and overlook data that doesn't. Scientists use peer review and pre-registration to combat this.
- "What is survivorship bias — can you think of a real example?"
→ Classic example: "Successful entrepreneurs all dropped out of college" — we only see the famous successes, not the thousands who dropped out and failed. This is why "success stories" as advice can be dangerous.
- "In the Simpson's Paradox hospital example — which treatment would you choose for a family member? Why?"
→ Treatment A, because it's better for BOTH mild AND severe cases. The combined statistic is misleading because the two treatments were given to very different mixes of mild and severe patients. This should feel unsettling: it means you MUST look at subgroups, not just totals.
- "What's the difference between a framing effect and a lie?"
→ Framing uses true numbers but chooses the presentation to create a desired impression. "90% fat free!" vs. "10% fat." Both are true, yet they leave very different impressions. Not a lie, but it can be deliberately misleading.
Data Detectives — 4 Case Studies
Groups of 3–4. Each group gets all 4 cases. 15 min to identify the deception; then class share-out. Emphasize: the data is technically accurate — the conclusion drawn is wrong.
- Case 1 — Cherry-Picking: A company shows sales figures for only the 3 best months of the year and claims "We're growing!" The other 9 months all showed decline. Flaw: selected favorable subset only.
- Case 2 — Survivorship Bias: "All the most successful athletes train 6 hours a day — so you should too!" Missing: the thousands who trained 6 hours/day and still didn't succeed. Flaw: only studying the outcomes that survived/succeeded.
- Case 3 — Framing Effect: Ad A: "20% of patients experienced side effects." Ad B: "80% of patients had NO side effects." Same drug, same data, different frames. Flaw: same statistic, opposite emotional impact.
- Case 4 — Simpson's Paradox: A school claims, "Our overall reading scores improved from 60% to 65%." But scores for advanced AND struggling readers each dropped. The improvement came from a change in the mix of students (a larger share of advanced readers than before). Flaw: combined result masks what happened to each group.
Debrief question: "Which trick was hardest to spot? Why do you think that is?"
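For instructors who want concrete numbers behind Case 4, here is a minimal Python sketch. The case gives only the overall 60% and 65% figures, so every count below is hypothetical and invented for illustration. The sketch also assumes the "score" is the share of students passing a reading test.

```python
# Hypothetical counts (passed, tested), invented for illustration only.
year1 = {"advanced": (36, 40), "struggling": (24, 60)}
year2 = {"advanced": (51, 60), "struggling": (14, 40)}

def pass_rate(passed, tested):
    """Pass rate for a single group."""
    return passed / tested

def overall(year):
    """Pass rate for the whole school, pooling both groups."""
    passed = sum(p for p, _ in year.values())
    tested = sum(t for _, t in year.values())
    return passed / tested

def share(year, group):
    """Fraction of all tested students who belong to `group`."""
    tested = sum(t for _, t in year.values())
    return year[group][1] / tested

for group in year1:
    before, after = pass_rate(*year1[group]), pass_rate(*year2[group])
    print(f"{group}: {before:.0%} -> {after:.0%}")
print(f"overall: {overall(year1):.0%} -> {overall(year2):.0%}")
print(f"advanced share: {share(year1, 'advanced'):.0%} -> {share(year2, 'advanced'):.0%}")
```

With these made-up counts, advanced readers fall from 90% to 85% and struggling readers from 40% to 35%, while the overall figure rises from 60% to 65%. The only thing that changed in the school's favor is the advanced share of students, which grows from 40% to 60%.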
Opening Hook
Show two bar graphs side by side — identical data, but one has a Y-axis starting at 95%, the other starting at 0%.
"Which graph makes the difference look bigger? Are both graphs accurate?"
→ Both are technically accurate — but the truncated axis makes a tiny difference look dramatic. This is one of the most common tricks in journalism and advertising.
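To put a number on "looks dramatic," the sketch below compares how tall the two bars are drawn under each axis choice. The values 96% and 98% are hypothetical stand-ins for whatever data your hook graph uses, and the helper names are illustrative.

```python
def drawn_height(value, axis_start):
    """Height of a bar as drawn when the Y-axis begins at axis_start."""
    return value - axis_start

def apparent_ratio(small, large, axis_start):
    """How many times taller the larger bar looks than the smaller one."""
    return drawn_height(large, axis_start) / drawn_height(small, axis_start)

low, high = 96.0, 98.0  # hypothetical values, in percent

full = apparent_ratio(low, high, axis_start=0)
truncated = apparent_ratio(low, high, axis_start=95)

print(f"Axis from 0:  taller bar is {full:.2f}x the shorter one")
print(f"Axis from 95: taller bar is {truncated:.2f}x the shorter one")
```

With these values the taller bar is about 1.02 times the shorter one on the full axis, but 3 times as tall on the truncated axis, even though the underlying 2-point difference is identical.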
Debrief Writing Prompt
Write on board: "Which deception technique was hardest for you to spot, and why? What question would a Data Detective ask to catch it?" 6 min writing. Students should name a specific technique and propose a specific "detective question" that exposes it.
Strong response: "Simpson's Paradox was hardest because the combined number looked real. A detective question would be: 'Are these groups the same size? What happens when you look at each group separately?'"
ND-Friendly Tips
- Simple examples first — Use school performance or sports stats before complex medical/political examples. Familiar contexts reduce cognitive load.
- Validate the confusion — Say explicitly: "Simpson's Paradox confuses trained statisticians. If it's hard for you, that's because it IS hard." This prevents shutdown.
- Frame as empowerment — "You now know this trick exists. Most adults don't. That makes you a better thinker." Not: "Data is always trying to deceive you."
- Case study cards — Physical cards give students something to hold, annotate, and refer back to. Better than a projected slide for extended work time.
- Allow pair work throughout — The detective activity is designed for groups. Don't require solo work during the case analysis phase.