Teacher Cheat Sheet — Session 4: When Data Deceives

Data Science for Young Minds · Grade 5 · Ages 10–11

~60 min · Ages 10–11 · Session 4 of 8 · ND-Friendly

Session Agenda

Time	Block	What's Happening
0–5	Hook	Show a graph with a truncated Y-axis and ask "What does this tell us?" Then show the same data with the Y-axis starting at 0.
5–20	Lesson 1–2	Cherry-picking · Survivorship bias · Framing effects — each with a real-world example
20–35	Lesson 3	Simpson's Paradox — two worked examples with small numbers; think step-by-step
35–52	Activity	Data Detectives: 4 case studies — identify the deception technique, find the flaw, reframe honestly
52–58	Debrief	Students write: "Which deception technique was hardest to spot? Why?"
58–60	Close	Preview S5: "Even honest data has randomness in it — next time we learn about probability."

Tone note: Frame this session as empowerment, not cynicism. "You now have the tools to catch this" rather than "data is always lying to you." These tricks fool even trained adults — validate the difficulty.

Materials Needed

Printed Case Study cards (4 cases; see below)
Worksheets (1 per student)
Pencils
Hook graph, printed or projected (truncated vs. full Y-axis)

Tip: Print case study cards double-sided so groups can reference the "flaw explanation" after they've attempted to identify it themselves.

Key Vocabulary

Cherry-picking — selecting only the data that supports your conclusion, ignoring the rest

Survivorship bias — only studying the "survivors" of a process, missing those who didn't make it

Framing effect — presenting the same data differently to create different impressions

Simpson's paradox — a trend appears in separate groups but reverses when groups are combined

Confirmation bias — the tendency to seek data that confirms what you already believe

Simpson's Paradox — Two Worked Examples for Instructors

Example 1: School Improvement (simpler)

Group	School A pass rate	School B pass rate
Strong students	90% (18/20)	85% (85/100)
Struggling students	30% (30/100)	20% (4/20)
Overall	40% (48/120)	74% (89/120)

School A is better in BOTH groups, yet the headline "School B has a 74% pass rate vs. School A's 40%" is also accurate. The paradox: School A has far more struggling students in its mix (100 of 120, against 20 of 120 at School B). When you combine groups of very different sizes, results can flip.

Example 2: Hospital Treatment Success

Patient type	Treatment A success	Treatment B success
Mild cases	93% (81/87)	87% (234/270)
Severe cases	73% (192/263)	69% (55/80)
Combined	78% (273/350)	83% (289/350)

Treatment A is better for BOTH mild AND severe patients — but overall, Treatment B appears better. Why? Treatment B is used more on mild (easier) cases. The group sizes create a misleading combined total. A hospital administrator relying only on the "83% vs. 78%" would choose the worse treatment!

Teaching the paradox: Work through Example 1 on the board step by step. Ask: "Which school is better in the strong student group? Which is better in the struggling group? Now look at the combined number. What happened?" Let students sit with the confusion before explaining. The confusion IS the lesson.
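
Checking a table before class: The arithmetic behind a Simpson's Paradox table is easy to get wrong, so it can help to verify it with a short script before putting it on the board. The sketch below is one possible way to do that in Python. The function name check_reversal is hypothetical, and the counts are illustrative assumptions (two schools, with School A enrolling mostly struggling students), not data from a real school.

```python
from fractions import Fraction

def rate(passed, total):
    """Pass rate as an exact fraction, so comparisons have no rounding error."""
    return Fraction(passed, total)

def check_reversal(a_groups, b_groups):
    """Return True when A beats B in every group but B beats A overall.

    Each argument is a list of (passed, total) pairs, one pair per group,
    listed in the same group order for A and B.
    """
    a_wins_each_group = all(
        rate(*a) > rate(*b) for a, b in zip(a_groups, b_groups)
    )
    a_overall = rate(sum(p for p, _ in a_groups), sum(t for _, t in a_groups))
    b_overall = rate(sum(p for p, _ in b_groups), sum(t for _, t in b_groups))
    return a_wins_each_group and b_overall > a_overall

# Illustrative counts (assumed): [strong students, struggling students]
school_a = [(18, 20), (30, 100)]   # 90% and 30%
school_b = [(85, 100), (4, 20)]    # 85% and 20%

print(check_reversal(school_a, school_b))  # True
```

If the function returns False, the table does not actually show a reversal, and the numbers deserve another look before the lesson.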

Discussion Questions + Teacher Notes

"Is cherry-picking always intentional?"
→ No — and this is important. Confirmation bias means we often cherry-pick unconsciously. We notice data that agrees with us and overlook data that doesn't. Scientists use peer review and pre-registration to combat this.
"What is survivorship bias — can you think of a real example?"
→ Classic example: "Successful entrepreneurs all dropped out of college" — we only see the famous successes, not the thousands who dropped out and failed. This is why "success stories" as advice can be dangerous.
"In the Simpson's Paradox hospital example — which treatment would you choose for a family member? Why?"
→ Treatment A, because it's better for BOTH mild AND severe cases. The combined statistic is misleading because Treatment B was given mostly to mild (easier) cases, which lifts its total. This should feel unsettling: it means you MUST look at subgroups, not just totals.
"What's the difference between a framing effect and a lie?"
→ Framing uses true numbers but selects presentation to create a desired impression. "90% fat free!" vs. "10% fat." Both true — very different impressions. Not a lie, but deliberately misleading.

Data Detectives — 4 Case Studies

Groups of 3–4. Each group gets all 4 cases. 15 min to identify the deception; then class share-out. Emphasize: the data is technically accurate — the conclusion drawn is wrong.

Case 1 — Cherry-Picking: A company shows sales figures for only the 3 best months of the year and claims "We're growing!" The other 9 months all showed decline. Flaw: selected favorable subset only.
Case 2 — Survivorship Bias: "All the most successful athletes train 6 hours a day — so you should too!" Missing: the thousands who trained 6 hours/day and still didn't succeed. Flaw: only studying the outcomes that survived/succeeded.
Case 3 — Framing Effect: Two ads describe the same drug. Ad 1: "20% of patients experienced side effects." Ad 2: "80% of patients had NO side effects." Same drug, same data, different frames. Flaw: same statistic, opposite emotional impact.
Case 4 — Simpson's Paradox: School claims "Our overall reading scores improved from 60% to 65%." But scores for both advanced AND struggling readers dropped individually. The improvement came from a change in the mix of students. Flaw: combined result masks what happened to each group.
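
Instructor check for Case 4: Students may ask how an overall score can rise while every group falls. The sketch below shows one set of numbers that produces the 60% to 65% headline. The counts are hypothetical, chosen only to fit the case, and the helper names are assumptions for this illustration.

```python
# Hypothetical counts: (readers at grade level, readers tested).
# Chosen to match the 60% -> 65% headline in Case 4; not real school data.
last_year = {"advanced": (36, 40), "struggling": (24, 60)}
this_year = {"advanced": (51, 60), "struggling": (14, 40)}

def percent(passed, total):
    """Share of tested readers at grade level, as a percentage."""
    return 100 * passed / total

def overall(year):
    """Combined percentage across all groups in one year."""
    passed = sum(p for p, _ in year.values())
    total = sum(t for _, t in year.values())
    return percent(passed, total)

for group in last_year:
    before = percent(*last_year[group])
    after = percent(*this_year[group])
    print(f"{group}: {before:.0f}% -> {after:.0f}%")

print(f"overall: {overall(last_year):.0f}% -> {overall(this_year):.0f}%")
```

In this sketch both groups drop by five points, but the share of advanced readers grows from 40 of 100 to 60 of 100, and that shift in the mix is what lifts the combined figure.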

Debrief question: "Which trick was hardest to spot? Why do you think that is?"

Opening Hook

Show two bar graphs side by side — identical data, but one has a Y-axis starting at 95%, the other starting at 0%.

"Which graph makes the difference look bigger? Are both graphs accurate?"

→ Both are technically accurate — but the truncated axis makes a tiny difference look dramatic. This is one of the most common tricks in journalism and advertising.
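
Putting a number on the exaggeration: For instructors who want to quantify the effect, the sketch below compares how tall the two bars are drawn under each axis choice. The bar values (96 and 98) and the function name are assumptions made for this illustration; substitute the values from whatever hook graph is used.

```python
def drawn_height_ratio(small, large, axis_start):
    """How many times taller the larger bar is drawn than the smaller one
    when the Y-axis begins at axis_start."""
    return (large - axis_start) / (small - axis_start)

# Hypothetical values for the two bars (assumed for this sketch).
score_a, score_b = 96, 98

print(drawn_height_ratio(score_a, score_b, axis_start=0))   # about 1.02
print(drawn_height_ratio(score_a, score_b, axis_start=95))  # 3.0
```

With the axis starting at 0 the bars look almost equal; starting at 95, the second bar is drawn three times as tall even though the underlying values differ by about 2%.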

Debrief Writing Prompt

Write on board:
"Which deception technique was hardest for you to spot, and why? What question would a Data Detective ask to catch it?"

6 min writing. Students should name a specific technique and propose a specific "detective question" that exposes it.

Strong response: "Simpson's Paradox was hardest because the combined number looked real. A detective question would be: 'Are these groups the same size? What happens when you look at each group separately?'"

ND-Friendly Tips

Simple examples first — Use school performance or sports stats before complex medical/political examples. Familiar contexts reduce cognitive load.
Validate the confusion — Say explicitly: "Simpson's Paradox confuses trained statisticians. If it's hard for you, that's because it IS hard." This prevents shutdown.
Frame as empowerment — "You now know this trick exists. Most adults don't. That makes you a better thinker." Not: "Data is always trying to deceive you."
Case study cards — Physical cards give students something to hold, annotate, and refer back to. Better than a projected slide for extended work time.
Allow pair work throughout — The detective activity is designed for groups. Don't require solo work during the case analysis phase.