Session Agenda
| Time (min) | Block | What's Happening |
| 0–5 | Hook | Show a truncated Y-axis graph. "What does this tell us?" vs. the same graph with the Y-axis starting at 0. |
| 5–20 | Lesson 1–2 | Cherry-picking · Survivorship bias · Framing effects, each with a real-world example |
| 20–35 | Lesson 3 | Simpson's Paradox: two worked examples with small numbers; think step-by-step |
| 35–52 | Activity | Data Detectives: 4 case studies. Identify the deception technique, find the flaw, reframe honestly |
| 52–58 | Debrief | Students write: "Which deception technique was hardest to spot? Why?" |
| 58–60 | Close | Preview S5: "Even honest data has randomness in it. Next time we learn about probability." |
Tone note: Frame this session as empowerment, not cynicism. "You now have the tools to catch this" rather than "data is always lying to you." These tricks fool even trained adults, so validate the difficulty.
Materials Needed
Printed Case Study cards (4 cases; see below)
Worksheets (1 per student)
Pencils
Hook graph printed or projected (truncated vs. full Y-axis)
Tip: Print case study cards double-sided so groups can reference the "flaw explanation" after they've attempted to identify it themselves.
Key Vocabulary
Cherry-picking: selecting only the data that supports your conclusion, ignoring the rest
Survivorship bias: only studying the "survivors" of a process, missing those who didn't make it
Framing effect: presenting the same data differently to create different impressions
Simpson's paradox: a trend appears in separate groups but reverses when groups are combined
Confirmation bias: the tendency to seek data that confirms what you already believe
Simpson's Paradox: Two Worked Examples for Instructors
Example 1: School Improvement (simpler)
| Group | School A pass rate | School B pass rate |
| Strong students | 90% (18/20) | 85% (85/100) |
| Struggling students | 30% (30/100) | 20% (4/20) |
| Overall | 40% (48/120) | 74% (89/120) |
School A is better in BOTH groups, yet the headline "School B has a 74% pass rate vs. School A's 40%" is also accurate. The paradox: School A has far more struggling students in its mix (100 of its 120 students, versus 20 of 120 at School B). When you combine groups of very different sizes, the overall result can flip.
Example 2: Hospital Treatment Success
| Patient type | Treatment A success | Treatment B success |
| Mild cases | 93% (81/87) | 87% (234/270) |
| Severe cases | 73% (192/263) | 69% (55/80) |
| Combined | 78% (273/350) | 83% (289/350) |
Treatment A is better for BOTH mild AND severe patients, but overall, Treatment B appears better. Why? Treatment B is used mostly on mild (easier) cases, while Treatment A is used mostly on severe ones. The group sizes create a misleading combined total. A hospital administrator relying only on the "83% vs. 78%" would choose the worse treatment!
Teaching the paradox: Work through Example 1 on the board step by step. Ask: "Who is better in the strong student group? Who is better in the struggling group? Now look at the combined number. What happened?" Let students sit with the confusion before explaining. The confusion IS the lesson.
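Before putting a Simpson's Paradox table on the board, it can help to recompute every rate from raw counts and confirm that the subgroup winner really does lose in the combined row. The sketch below is one way to do that check. The counts are illustrative assumptions chosen for this sketch (School A mostly struggling students, School B mostly strong students), and the helper names are hypothetical.

```python
from fractions import Fraction

# Illustrative counts, assumed for this sketch: (passes, students) per group.
school_a = {"strong": (18, 20), "struggling": (30, 100)}
school_b = {"strong": (85, 100), "struggling": (4, 20)}


def rate(passes, total):
    """Exact pass rate as a fraction, so rounding cannot hide a tie."""
    return Fraction(passes, total)


def combined(groups):
    """Pool all groups: total passes divided by total students."""
    passes = sum(p for p, _ in groups.values())
    total = sum(n for _, n in groups.values())
    return Fraction(passes, total)


def is_simpson_reversal(a, b):
    """True if a beats b in every subgroup but loses once groups are pooled."""
    a_wins_each_group = all(rate(*a[g]) > rate(*b[g]) for g in a)
    return a_wins_each_group and combined(a) < combined(b)


for group in school_a:
    print(group, float(rate(*school_a[group])), float(rate(*school_b[group])))
print("combined", float(combined(school_a)), float(combined(school_b)))
print("reversal:", is_simpson_reversal(school_a, school_b))
```

If `is_simpson_reversal` returns False for a table, that table does not show the paradox, however different the group sizes are.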
Discussion Questions + Teacher Notes
- "Is cherry-picking always intentional?"
→ No, and this is important. Confirmation bias means we often cherry-pick unconsciously. We notice data that agrees with us and overlook data that doesn't. Scientists use peer review and pre-registration to combat this.
- "What is survivorship bias? Can you think of a real example?"
→ Classic example: "Successful entrepreneurs all dropped out of college." We only see the famous successes, not the thousands who dropped out and failed. This is why "success stories" as advice can be dangerous.
- "In the Simpson's Paradox hospital example, which treatment would you choose for a family member? Why?"
→ Treatment A, because it's better for BOTH mild AND severe cases. The combined statistic is misleading due to different group sizes. This should feel unsettling: it means you MUST look at subgroups, not just totals.
- "What's the difference between a framing effect and a lie?"
→ Framing uses true numbers but selects the presentation to create a desired impression. "90% fat free!" vs. "10% fat." Both true, very different impressions. Not a lie, but it can be deliberately misleading.
Data Detectives: 4 Case Studies
Groups of 3–4. Each group gets all 4 cases. 15 min to identify the deception; then class share-out. Emphasize: the data is technically accurate, but the conclusion drawn is wrong.
- Case 1 (Cherry-Picking): A company shows sales figures for only the 3 best months of the year and claims "We're growing!" The other 9 months all showed decline. Flaw: selected favorable subset only.
- Case 2 (Survivorship Bias): "All the most successful athletes train 6 hours a day, so you should too!" Missing: the thousands who trained 6 hours/day and still didn't succeed. Flaw: only studying the outcomes that survived/succeeded.
- Case 3 (Framing Effect): Drug A: "20% of patients experienced side effects." Drug B: "80% of patients had NO side effects." Presented as two drugs, but it is the same drug and the same data in different frames. Flaw: same statistic, opposite emotional impact.
- Case 4 (Simpson's Paradox): School claims "Our overall reading scores improved from 60% to 65%." But scores for both advanced AND struggling readers dropped individually. The improvement came from a change in the mix of students. Flaw: combined result masks what happened to each group.
Debrief question: "Which trick was hardest to spot? Why do you think that is?"
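Case 4 can look impossible to students (and to instructors) until the arithmetic is laid out. The sketch below shows one set of counts that produces the 60% to 65% overall rise while both groups fall. The counts and function names are hypothetical, made up for illustration; the case card itself does not give group sizes.

```python
from fractions import Fraction

# Hypothetical counts for illustration:
# (students reading at level, students in group).
last_year = {"advanced": (36, 40), "struggling": (24, 60)}
this_year = {"advanced": (51, 60), "struggling": (14, 40)}


def group_rate(year, group):
    """Share of one group reading at level."""
    at_level, total = year[group]
    return Fraction(at_level, total)


def overall_rate(year):
    """Share of all students reading at level, pooling both groups."""
    at_level = sum(a for a, _ in year.values())
    total = sum(n for _, n in year.values())
    return Fraction(at_level, total)


for group in last_year:
    before = group_rate(last_year, group)
    after = group_rate(this_year, group)
    print(f"{group}: {float(before):.0%} -> {float(after):.0%}")

print(f"overall: {float(overall_rate(last_year)):.0%} "
      f"-> {float(overall_rate(this_year)):.0%}")
```

In this sketch each group drops by 5 points, but the share of advanced readers grows from 40 to 60 of 100 students, and that shift alone lifts the overall figure.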
Opening Hook
Show two bar graphs side by side: identical data, but one has a Y-axis starting at 95%, the other starting at 0%.
"Which graph makes the difference look bigger? Are both graphs accurate?"
→ Both are technically accurate, but the truncated axis makes a tiny difference look dramatic. This is one of the most common tricks in journalism and advertising.
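To put a number on the hook for students, compare how tall the two bars are drawn under each axis. This is a minimal sketch with hypothetical bar values (96% and 98%); the function name is an assumption for illustration, and no plotting library is needed.

```python
def drawn_height_ratio(value_a, value_b, axis_start):
    """Ratio of bar B's drawn height to bar A's when the Y-axis begins at axis_start."""
    return (value_b - axis_start) / (value_a - axis_start)


# Hypothetical values for the two bars, in percent.
bar_a, bar_b = 96, 98

# Full axis: the bars look nearly identical (ratio close to 1).
print(drawn_height_ratio(bar_a, bar_b, axis_start=0))

# Axis starting at 95: bar B is drawn three times as tall as bar A.
print(drawn_height_ratio(bar_a, bar_b, axis_start=95))
```

The data never changes between the two calls; only the baseline does.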
Debrief Writing Prompt
Write on board:
"Which deception technique was hardest for you to spot, and why? What question would a Data Detective ask to catch it?"
6 min writing. Students should name a specific technique and propose a specific "detective question" that exposes it.
Strong response: "Simpson's Paradox was hardest because the combined number looked real. A detective question would be: 'Are these groups the same size? What happens when you look at each group separately?'"
ND-Friendly Tips
- Simple examples first: Use school performance or sports stats before complex medical/political examples. Familiar contexts reduce cognitive load.
- Validate the confusion: Say explicitly: "Simpson's Paradox confuses trained statisticians. If it's hard for you, that's because it IS hard." This prevents shutdown.
- Frame as empowerment: "You now know this trick exists. Most adults don't. That makes you a better thinker." Not: "Data is always trying to deceive you."
- Case study cards: Physical cards give students something to hold, annotate, and refer back to. Better than a projected slide for extended work time.
- Allow pair work throughout: The detective activity is designed for groups. Don't require solo work during the case analysis phase.