Learn Without Walls
Research Methods — Module 06

Analyzing Your Data

Analysis is where data starts to tell a story — the goal is to find patterns and answer your research question

📌 Before You Start

Prerequisites: Modules 1–5. No statistical software needed for this module.

Estimated time: ~45 minutes including the mini-analysis exercise.

What you need: Pen and paper or a calculator. The small dataset in the Your Turn section.

By the end of this module you will be able to describe quantitative analysis basics, conduct a mini thematic analysis, and identify common analytical mistakes.

💡 The Big Idea

Analysis is where your data starts to tell a story. The goal is to find patterns, make meaning, and answer your research question — while being honest about uncertainty and the limits of what your data can actually show.

🔍 Deep Dive

Quantitative Analysis: Descriptive Statistics

Before any complex analysis, describe your data. Descriptive statistics summarize the basic features of your dataset.

| Measure | What it tells you | When to use |
|---|---|---|
| Mean | The arithmetic average. Sensitive to outliers. | Symmetric, roughly normal distributions. Test scores or similar measures without extreme outliers. |
| Median | The middle value when sorted. Resistant to outliers. | Skewed distributions. Better for income, housing prices, or anything with extreme values. |
| Mode | The most frequent value. | Categorical data. "What is the most common response?" |
| Standard Deviation (SD) | How spread out the data is around the mean. Larger SD = more spread. | Always report alongside the mean. A mean without an SD is incomplete. |
| Frequency / Percentage | How often each value or category appears. | Categorical data (yes/no, gender, major, response options). |
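This module needs no software, but if you want to see these measures computed, here is an optional sketch using Python's built-in `statistics` module. The income figures are made up for illustration, and the last value is a deliberate outlier so you can see how it pulls the mean but not the median.

```python
import statistics

# Hypothetical monthly incomes in dollars (made up for illustration).
# The last value is a deliberate outlier.
incomes = [2100, 2300, 2300, 2500, 2700, 2900, 15000]

mean_income = statistics.mean(incomes)      # arithmetic average
median_income = statistics.median(incomes)  # middle value when sorted
mode_income = statistics.mode(incomes)      # most frequent value
sd_income = statistics.stdev(incomes)       # sample standard deviation (divides by n - 1)

print(f"mean   = {mean_income:.1f}")
print(f"median = {median_income}")
print(f"mode   = {mode_income}")
print(f"SD     = {sd_income:.1f}")
```

The single outlier pulls the mean well above the median, which is why the median is usually the safer summary for skewed data such as income.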

Comparing Groups: Is the Difference Real?

If your research question compares groups (e.g., "Do students who receive tutoring perform better than those who don't?"), you need to determine whether the difference you observe is real or just due to random chance.

Basic inferential statistics (concepts only — no formulas here):

| Test | When to use it |
|---|---|
| t-test | Comparing means of two groups. Example: Do tutored students score higher than non-tutored students? |
| Chi-square (χ²) | Comparing frequencies or proportions of categories. Example: Are women more likely than men to report financial stress? |
| Correlation | Measuring the strength and direction of a relationship between two continuous variables. Example: Is there a relationship between study hours and GPA? |

What is a p-value? A p-value is the probability of getting results at least as extreme as yours if the null hypothesis (no real difference or relationship) were true. A small p-value (conventionally < 0.05) means your data would be surprising under the null hypothesis, so chance alone is an unlikely explanation. It does NOT prove your hypothesis is true, and it is not the probability that the null hypothesis is true.
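One way to make the p-value idea concrete is a permutation test. It is a different technique from the t-test in the table above, but it answers the same kind of question by brute force: if the group labels were meaningless, how often would shuffling them produce a difference at least as large as the one observed? The sketch below is optional, and the exam scores in it are made up for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical exam scores (made up for illustration).
tutored = [78, 85, 90, 82, 88]
not_tutored = [70, 75, 80, 72, 77]

# Observed difference in group means (absolute value, so the test is two-sided).
observed = abs(mean(tutored) - mean(not_tutored))

pooled = tutored + not_tutored
n = len(tutored)
indices = range(len(pooled))

extreme = 0  # relabelings with a difference at least as large as observed
total = 0    # all possible ways to split the 10 scores into two groups of 5
for group_a in combinations(indices, n):
    a = [pooled[i] for i in group_a]
    b = [pooled[i] for i in indices if i not in group_a]
    total += 1
    # Small tolerance guards against floating-point rounding.
    if abs(mean(a) - mean(b)) >= observed - 1e-9:
        extreme += 1

p_value = extreme / total
print(f"observed difference in means: {observed:.1f}")
print(f"relabelings at least as extreme: {extreme} of {total}")
print(f"p-value: {p_value:.4f}")
```

Here the p-value is simply a proportion: the share of all possible relabelings that look at least as extreme as the real data. That is the "assuming the null hypothesis is true" part of the definition made visible.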

Visualizing Quantitative Data

The right chart depends on what you are trying to show:

| Chart Type | Best for |
|---|---|
| Bar chart | Comparing categories (e.g., average GPA by major) |
| Histogram | Showing the distribution of a continuous variable (e.g., distribution of study hours) |
| Scatter plot | Showing the relationship between two continuous variables (e.g., sleep vs. GPA) |
| Pie / Donut chart | Showing proportions of a whole (use sparingly; bar charts are often clearer) |

Qualitative Analysis: Thematic Analysis

Thematic analysis is the most common qualitative method for analyzing interviews, open-ended survey responses, or documents. It involves identifying patterns (themes) in text.

  1. Read and re-read. Immerse yourself in the data. Read through all your transcripts or responses at least twice before analyzing.
  2. Code. A code is a label you assign to a segment of text that captures what it is about. Example: a participant says "I skipped meals to afford textbooks"; you might code this as "financial sacrifice" or "food insecurity."
  3. Find themes. Group related codes together. A theme is a broader pattern that captures something meaningful about the data. Multiple codes may combine into one theme.
  4. Review and refine. Check that each theme makes sense, is clearly distinct from others, and is supported by multiple data examples.
  5. Interpret. Write up what each theme means in relation to your research question. Use direct quotes as evidence.

What is saturation? In qualitative research, you have collected enough data when you reach saturation: the point at which new interviews or responses are no longer producing new codes or themes. New data just confirms what you already found.
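Saturation can be tracked informally by counting how many new codes each additional interview contributes. The optional sketch below uses hypothetical coded interviews, and its stopping rule (three interviews in a row with no new codes) is an assumption chosen for illustration, not an established standard.

```python
# Hypothetical codes assigned to six interviews, in the order they were analyzed.
coded_interviews = [
    {"financial stress", "work-life balance"},
    {"belonging", "financial stress"},
    {"imposter syndrome", "belonging"},
    {"work-life balance", "financial stress"},
    {"belonging"},
    {"financial stress", "imposter syndrome"},
]

seen = set()            # every code encountered so far
new_per_interview = []  # how many previously unseen codes each interview added
for codes in coded_interviews:
    fresh = codes - seen
    new_per_interview.append(len(fresh))
    seen |= codes

# Illustrative stopping rule (an assumption): the last `window` interviews
# added no new codes.
window = 3
saturated = (
    len(new_per_interview) >= window
    and sum(new_per_interview[-window:]) == 0
)

print("new codes per interview:", new_per_interview)
print("saturation reached under this rule:", saturated)
```

A run of zeros at the end of the list is the pattern to look for. Whether three zeros are enough is a judgment call that depends on your study.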

Mixed Methods: The Best of Both

Mixed methods combines quantitative and qualitative approaches in the same study. Common patterns:

Common Analytical Mistakes

Cherry-picking data: Only reporting findings that support your hypothesis, ignoring contradictory results. This is a form of research misconduct.
P-hacking: Running many statistical tests until you find a "significant" result, then only reporting that one. This inflates false positive rates dramatically.
Ignoring outliers: Unusual data points are information, not noise to discard. Always investigate outliers before removing them.
Confusing correlation and causation: Even a perfect correlation (r = 1.0) does not mean one variable caused the other. There may be a third variable (confound) causing both.
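The false-positive inflation from p-hacking can be put in numbers. The optional sketch below makes two simplifying assumptions: every test is independent, and there is no real effect in any of them. The chance of at least one "significant" result at the 0.05 threshold is then 1 − (1 − 0.05)^k for k tests. The function name is hypothetical.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability of at least one p < alpha across independent tests
    when no real effect exists in any of them."""
    return 1 - (1 - alpha) ** num_tests


for k in (1, 5, 10, 20):
    print(f"{k:>2} tests: {chance_of_false_positive(k):.0%} chance of a false positive")
```

Under these assumptions, running 20 tests makes a spurious "significant" finding more likely than not, which is why reporting only the one test that "worked" is misleading.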

📋 Real Example: A Mini Thematic Analysis

Survey question: "What is the biggest challenge you face as a college student?"

Here are responses from 8 students (condensed). Codes are shown in brackets.

  1. "Balancing work and classes is exhausting. I work 30 hours a week." [work-life balance] [fatigue]
  2. "I never feel like I belong here. Everyone seems to already know what they're doing." [belonging] [imposter syndrome]
  3. "Money. Always money. I stress about rent every month." [financial stress]
  4. "I'm the first in my family to go to college. No one can help me navigate this." [first-generation] [lack of support]
  5. "Working nights means I miss office hours and study groups." [work-life balance] [isolation]
  6. "I have ADHD and the lecture format doesn't work for me." [disability] [learning environment]
  7. "I feel like I'm always behind financially. I can't afford the calculator we need for class." [financial stress]
  8. "Sometimes I think everyone else gets this except me." [imposter syndrome] [belonging]
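Tallying how often each code appears is a useful first step before grouping codes into themes. This optional sketch counts the bracketed codes from the eight responses above. The variable names are chosen for illustration.

```python
from collections import Counter

# Codes assigned to each of the eight responses listed above.
codes_by_response = {
    1: ["work-life balance", "fatigue"],
    2: ["belonging", "imposter syndrome"],
    3: ["financial stress"],
    4: ["first-generation", "lack of support"],
    5: ["work-life balance", "isolation"],
    6: ["disability", "learning environment"],
    7: ["financial stress"],
    8: ["imposter syndrome", "belonging"],
}

# Count every code across all responses.
code_counts = Counter(
    code for codes in codes_by_response.values() for code in codes
)

for code, count in code_counts.most_common():
    print(f"{count}  {code}")
```

Counts alone do not make a theme. They show which codes recur, and you still have to judge which codes belong together and why.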

Emerging themes:

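Interpretation: Financial stress, work-life balance, and belonging or imposter syndrome (coded together in responses 2 and 8) each appear in two of the eight responses, making them the most common challenges in this sample. First-generation status appears in only one response, so these data cannot show whether first-generation students are more vulnerable to any of them. Note: With only 8 responses, these themes are preliminary; a larger dataset is needed before claiming saturation.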

🖐️ Your Turn

What you need: A calculator or paper. About 15 minutes.

Here is a small dataset of 10 students. For each student, you have: study hours per day, hours of sleep per night, and GPA (on a 4.0 scale).

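| Student | Study Hours/Day | Sleep Hours/Night | GPA |
|---|---|---|---|
| 1 | 2 | 6 | 2.8 |
| 2 | 4 | 7 | 3.4 |
| 3 | 1 | 5 | 2.2 |
| 4 | 5 | 8 | 3.8 |
| 5 | 3 | 7 | 3.1 |
| 6 | 6 | 6 | 3.6 |
| 7 | 2 | 5 | 2.5 |
| 8 | 4 | 8 | 3.5 |
| 9 | 1 | 4 | 1.9 |
| 10 | 5 | 7 | 3.7 |

  1. Calculate the mean for study hours, sleep hours, and GPA across all 10 students.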
  2. Find the highest and lowest GPA in the dataset. Which students have them?
  3. Looking at the data, what pattern do you observe between study hours and GPA? Between sleep and GPA?
  4. Important caution: With only 10 students, can you conclude that studying more causes a higher GPA? What confounding variables might explain the pattern?

Mean GPA answer to check your work: 3.05
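After working the exercise by hand, you can check your numbers with this optional sketch. It uses the dataset above. The `pearson_r` helper is a hand-written illustration of the correlation coefficient (the module itself gives no formulas), and it measures only how strongly two columns move together, not whether one causes the other.

```python
from math import sqrt
from statistics import mean

# The 10-student dataset from the table above.
study = [2, 4, 1, 5, 3, 6, 2, 4, 1, 5]
sleep = [6, 7, 5, 8, 7, 6, 5, 8, 4, 7]
gpa = [2.8, 3.4, 2.2, 3.8, 3.1, 3.6, 2.5, 3.5, 1.9, 3.7]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


print(f"mean study hours: {mean(study):.2f}")
print(f"mean sleep hours: {mean(sleep):.2f}")
print(f"mean GPA:         {mean(gpa):.2f}")

# Student numbers start at 1, list positions at 0.
print("highest GPA: student", gpa.index(max(gpa)) + 1)
print("lowest GPA:  student", gpa.index(min(gpa)) + 1)

print(f"r(study, GPA) = {pearson_r(study, gpa):.2f}")
print(f"r(sleep, GPA) = {pearson_r(sleep, gpa):.2f}")
```

Both correlations come out strongly positive, but with 10 students and no control over other factors, that is a pattern to investigate and not evidence of cause.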

🧠 Brain Break — 2 Minutes

Think about a statistic you have seen recently.

A news headline. A product claim. A political argument. "X% of people believe..." or "Y is linked to Z..."

Ask: Is that a mean or median? What is the sample size? Could there be a confounding variable? Is the claim correlation, or are they implying causation? Now you have the tools to ask these questions every time.

✅ Key Takeaways

🎯 Module 6 Complete!

You can now make sense of data. In Module 7, you will learn the ethical rules that protect participants and the integrity of the research process.


