Session 8 — Capstone Research Project | Grade 5 Data Science
Session 8 of 8 — Capstone

You Are a Data Scientist

Today you run your own research project — start to finish.

Ask a question. Plan your study. Collect real data.
Analyze it. Share your findings. That's science.

Data Science for Young Minds · Grade 5 · Ages 10–11

Looking Back

8 Sessions. 8 Skills.

S1 — Data Types

Primary vs secondary, quantitative vs qualitative, structured vs unstructured

S2 — Hypotheses

Testable questions, variables, predict before you collect

S3 — Sampling

Random, convenience, stratified; sample size matters; Literary Digest failure

S4 — Deception

Truncated axes, cherry-picking, survivorship bias, Simpson's Paradox

S5 — Probability

Experimental vs theoretical, Law of Large Numbers, gambler's fallacy

S6 — Correlation

Correlation ≠ causation, confounding variables, spurious correlations

S7 — Ethics

Privacy, consent, transparency, algorithmic bias, accountability

S8 — Capstone

Put it ALL together in your own research project

The Data Science Process

5 Stages — You'll Do All of Them Today

1
ASK
Form a testable research question and hypothesis
2
PLAN
Decide what data to collect, how, and from whom
3
COLLECT
Record real data carefully and systematically
4
ANALYZE
Calculate, graph, find patterns, interpret
5
SHARE
Write a conclusion and present your findings

Published studies come in many shapes, but almost all of them follow some version of these five steps.

Stage 1

Ask — Form Your Research Question

A good research question is:

  • Specific — not "Do people like things?" but "What percent of our class prefers...?"
  • Testable — you can actually collect data to answer it
  • Feasible — you can do it in this class, today
  • Interesting — you actually want to know the answer

Hypothesis formula:

"I predict that ___ because ___."

Your hypothesis is your educated guess — based on what you already know.

It's OK if your data doesn't support it. That's still science!

Examples:

  • "What percent of our class gets 8+ hours of sleep on school nights?"
  • "Is there a relationship between favorite subject and hours of screen time?"
  • "How do classmates rate their mood on a scale of 1–5 today, and does it correlate with how much sleep they got?"

Stage 2

Plan — Design Your Study

What data?

What exactly will you measure or ask? Be specific. "How many hours of sleep" is better than "sleep."

How to collect?

  • Survey your classmates
  • Observe and count
  • Conduct a simple experiment
  • Use existing data

Who is your sample?

Probably: this class. What are the limitations of using classmates as your sample? (Think: Session 3!)

Don't forget from Sessions 3–7:

  • What could make your sample biased?
  • How will you avoid leading questions in a survey?
  • Is your data collection method ethical? Did you get consent?

Stage 3

Collect — Record Your Data

Good data collection habits:

  • Record data as you collect it — don't rely on memory
  • Fill in every cell in your table
  • If data is missing, write "N/A" — don't skip or guess
  • Don't change your question mid-collection
  • Record exactly what you observe — don't adjust to match your hypothesis

Your data table should have:

  • A clear column header for each variable
  • One row per participant or observation
  • Units where needed (e.g., "hours", "km", "1–5 scale")
  • At least 10 data points for meaningful analysis
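
Optional, if your class wants to try this on a computer: the checklist above can be sketched as a small Python table. This is only one possible layout. The column names (participant, sleep_hours, mood_1_to_5) and all of the values are made up for illustration, and a real table would have at least 10 rows.

```python
# A tiny made-up data table: one row (a dict) per participant.
# Column names and values are hypothetical examples, not real results.
COLUMNS = ["participant", "sleep_hours", "mood_1_to_5"]

rows = [
    {"participant": 1, "sleep_hours": 9, "mood_1_to_5": 4},
    {"participant": 2, "sleep_hours": 7, "mood_1_to_5": 3},
    {"participant": 3, "sleep_hours": "N/A", "mood_1_to_5": 5},  # missing answer marked, not guessed
]


def check_table(rows, columns):
    """Return (row number, column) for every cell that was skipped or left blank."""
    problems = []
    for number, row in enumerate(rows, start=1):
        for column in columns:
            if column not in row or row[column] == "":
                problems.append((number, column))
    return problems


def count_missing(rows):
    """Count the cells that were honestly marked "N/A"."""
    return sum(1 for row in rows for value in row.values() if value == "N/A")


print(check_table(rows, COLUMNS))  # an empty list means every cell is filled in
print(count_missing(rows))         # how many cells say "N/A"
```

A cell marked "N/A" passes the check on purpose: it was recorded honestly, so it is not counted as a skipped cell.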

Remember the Law of Large Numbers: more data = more reliable results.
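
Optional computer demo: the idea that more data gives more reliable results can be tried with pretend coin flips. This is a rough sketch, not part of the project itself. The function name heads_fraction and the seed number are choices made for this example. A fair coin should land heads about half the time, and bigger samples usually land closer to that.

```python
import random


def heads_fraction(flips, rng):
    """Flip a pretend fair coin `flips` times and return the fraction of heads."""
    heads = sum(1 for _ in range(flips) if rng.random() < 0.5)
    return heads / flips


# A fixed seed (an arbitrary choice) so the demo gives the same numbers every run.
rng = random.Random(8)

for flips in (10, 100, 10_000):
    print(flips, "flips ->", heads_fraction(flips, rng), "heads")
```

Try changing the seed and running it again: the 10-flip result usually jumps around much more than the 10,000-flip result.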

Stage 4

Analyze — Find the Story in Your Data

Step 1: Calculate

Find totals for each category. Calculate percentages:

% = count ÷ total × 100
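
Optional computer version: here is one way the formula above might look in Python. The answers are made up (8 pretend classmates), and the function name percent is just a label chosen for this sketch.

```python
from collections import Counter


def percent(count, total):
    """Apply the formula: count divided by total, times 100."""
    if total == 0:
        raise ValueError("total must be greater than 0")
    return count / total * 100


# Made-up answers to a yes/no survey question: 6 "yes" and 2 "no".
answers = ["yes", "no", "yes", "yes", "no", "yes", "yes", "yes"]

totals = Counter(answers)  # finds the total for each category
for category, count in totals.items():
    print(category, count, percent(count, len(answers)))
# prints:
# yes 6 75.0
# no 2 25.0
```

The percentages for all categories should add up to 100. If they don't, recount.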

Step 2: Graph

  • Bar chart — great for categories
  • Scatter plot — great for two numeric variables
  • Label axes, title your graph, include units
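
Optional computer version: a bar chart can even be drawn with plain text. This is a simple sketch, assuming your data is a count for each category. The function name text_bar_chart, the subjects, and the counts are all made up for illustration. Paper and markers work just as well.

```python
def text_bar_chart(title, counts, legend="each # = 1 student"):
    """Build a text bar chart with a title, one labeled bar per category, and a legend."""
    width = max((len(label) for label in counts), default=0)
    lines = [title]
    for label, count in counts.items():
        # Pad the labels so every bar starts in the same column.
        lines.append(f"{label.ljust(width)} | {'#' * count} ({count})")
    lines.append(legend)  # the legend says what one symbol is worth (the units)
    return "\n".join(lines)


# Made-up counts for a pretend class survey.
favorite_subject = {"Math": 5, "Art": 7, "Science": 4}
print(text_bar_chart("Favorite subject in our class (made-up data)", favorite_subject))
```

Every bar starts at zero here, so the chart cannot hide a truncated axis.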

Step 3: Interpret

  • What pattern do you see?
  • Does it support your hypothesis?
  • Any surprises?
  • Correlation or causation?

Important: Your hypothesis can be "not supported" — that is still valid science. What matters is that your conclusion is based on your actual data.

Stage 4 — Continued

Write Your Conclusion Using C-E-R

C
CLAIM

State whether your hypothesis was supported or not supported. Be direct.

"My hypothesis was [supported / not supported] because..."

E
EVIDENCE

Cite specific numbers from your data. No vague statements.

"In my data, ___% of participants... The graph shows..."

R
REASONING

Explain how your evidence connects to your claim. Address limitations.

"This suggests... However, my sample was only... so..."

Stage 5

Share — Communicate Your Findings

Real scientists don't just collect data — they share it so others can learn from it, check it, and build on it. That's how science grows.

Mini-presentation tips (30 sec):

  • Start with your question
  • State what you found (one key number)
  • Say whether your hypothesis was supported
  • Name one limitation of your study

What to look for in others' presentations:

  • Did they cite actual data (not just "most people")?
  • Did they distinguish correlation vs causation?
  • Did they mention limitations honestly?
  • What question would you ask them?

Even a small, imperfect study with honest analysis is more valuable than a confident claim with no data.

Reflection

Every Good Study Acknowledges Its Limits

Common limitations to consider:

  • Small sample size — with only a few data points, chance alone can swing your results
  • Convenience sample — you surveyed who was available, not a random group, so classmates may not represent everyone
  • Self-report bias — people may not answer honestly
  • Confounding variables — something else could explain your result
  • One point in time — results might be different on another day

Ethics check for your study:

  • Did participants know what the data was for?
  • Was participation voluntary?
  • Did you keep responses anonymous?
  • Would you be comfortable if someone did this study on you?
  • Will you report what you actually found — even if it doesn't match your hypothesis?

Big Picture

This Is What Real Data Scientists Do

In medicine:

Clinical trials follow the same five stages: ask, plan, collect, analyze, share. Most modern medicines were tested this way before doctors could prescribe them.

In climate science:

Climate scientists collect data from thousands of weather stations, analyze patterns over decades, and share findings with the world.

In social science:

Researchers survey people, study patterns in society, and use data to inform policies that affect millions of lives.

The tools are bigger and the datasets are larger — but the process is identical to what you're doing today. You are doing real science.

Data science is not about computers. It's about asking good questions, collecting honest evidence, and reasoning carefully.

Critical Thinking

The 5 Questions You'll Ask Forever

1

How do they know that?

What data was collected? How? By whom?

2

How big was the sample — and was it representative?

10 people? 10,000? A biased group?

3

Are they confusing correlation with causation?

Could a confounding variable explain this?

4

Is the graph showing the full picture?

Truncated Y-axis? Cherry-picked time range?

5

Who collected this data and why?

Does the collector have an interest in a particular result?

🧠 Quick Pair Share

Turn to a partner. You have 60 seconds each. Answer this question:

"What is one thing you learned in this course that changed how you look at data in real life?"

Give one example — a news story, a statistic you've seen, an app you use, anything.

Course Complete

You Are Now a Critical Data Thinker

You can spot misleading graphs
Truncated axes, cherry-picked data, misleading scales
You understand sampling
Sample size matters; who is and isn't in a sample matters
You know probability
Experimental vs theoretical; the gambler's fallacy is wrong
You separate correlation and causation
A confounding variable can explain a pattern that looks like cause and effect
You think about data ethics
Privacy, consent, bias, accountability — all matter
You can run a research project
Ask → Plan → Collect → Analyze → Share

Every time you see a statistic, a graph, or a headline — you have the tools to ask better questions.

🎓

Congratulations, Data Scientists!

You completed all 8 sessions of Data Science for Young Minds.

You asked questions. You collected data. You found patterns.
You thought about who could be helped or harmed by data.
You built skills that will matter for the rest of your lives.

Final challenge: Stay curious. Keep asking "How do they know that?"