Module 1 Study Guide

Introduction to Statistics & Data

Free Statistics Learning Platform • Safaa Dabagh

1. What is Statistics?

Statistics: The science of collecting, organizing, analyzing, interpreting, and presenting data to help make informed decisions based on evidence.

Two Branches of Statistics

Descriptive Statistics

Purpose: Summarize and describe data you've already collected

What it does: Uses numbers, graphs, and tables to paint a picture of your data

Example: "The average test score in our class was 82%"

Think of it as: "What happened?"

Inferential Statistics

Purpose: Make predictions or generalizations about a larger population based on a sample

What it does: Uses probability and mathematical techniques to draw conclusions beyond your data

Example: "Based on polling 1,000 voters, we predict 52% of all voters will support Candidate A"

Think of it as: "What can we conclude or predict?"
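
The two branches can be contrasted with a small sketch. The scores and poll counts below are hypothetical, and the margin-of-error line is an assumption that goes slightly beyond this module: it uses the common normal approximation for a proportion and presumes a simple random sample.

```python
import math
from statistics import mean

# Descriptive: summarize data we already have (hypothetical class scores).
scores = [78, 85, 92, 70, 85]
class_average = mean(scores)  # describes this class only

# Inferential: use a sample to estimate something about a larger population.
# Hypothetical poll: 520 of 1,000 sampled voters support Candidate A.
sample_size = 1000
supporters = 520
p_hat = supporters / sample_size  # sample proportion

# Approximate 95% margin of error (normal approximation, assumed here).
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / sample_size)

print(f"Class average: {class_average}")
print(f"Estimated support: {p_hat:.1%} +/- {margin:.1%}")
```

The first result only describes the five scores in hand. The second is a statement about all voters, which is why it comes with a measure of uncertainty.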

CRITICAL CONCEPT: Correlation ≠ Causation
Just because two things are related doesn't mean one causes the other!
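
One way to see this is with a lurking variable. In the sketch below, all numbers are made up: temperature drives both ice cream sales and the number of swimmers, and neither outcome is ever computed from the other. The helper name pearson_r is an illustrative choice, not something defined in this guide.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up daily temperatures: this is the lurking variable.
temperature = [15, 18, 21, 24, 27, 30, 33]

# Both outcomes depend on temperature (plus a small bump), never on each other.
sales_bumps = [5, -3, 4, -6, 2, -4, 3]
swim_bumps = [-2, 4, -1, 3, -4, 1, 2]
ice_cream_sales = [10 * t + b for t, b in zip(temperature, sales_bumps)]
swimmers = [3 * t + b for t, b in zip(temperature, swim_bumps)]

r = pearson_r(ice_cream_sales, swimmers)
print(f"r = {r:.2f}")  # close to 1: strongly correlated, yet no causal link
```

The correlation is strong even though buying ice cream does not make anyone swim. Both simply respond to hot weather.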

2. Types of Data

Main Categories

Quantitative (Numerical)
Data measured with numbers
Can do math operations
Examples: Height, age, weight, temperature, test scores

Qualitative (Categorical)
Data that describes characteristics
Categories/labels
Examples: Eye color, gender, major, zip code, car type

Quantitative Data Subtypes

Discrete
Only specific values (usually whole numbers)
You count it
Examples: Number of students (25, 26, 27), dice rolls, cars in lot

Continuous
Any value in a range (including decimals)
You measure it
Examples: Height (5'8.2"), weight (150.3 lbs), temperature (72.5°F)

Levels of Measurement

Level | Description | Examples
Nominal | Categories with no order | Eye color, zip codes, student ID
Ordinal | Categories with meaningful order, but gaps between levels are not necessarily equal | Class rank (1st, 2nd, 3rd), satisfaction (low, medium, high)
Interval | Equal intervals, NO true zero | Temperature (°F, °C), IQ scores, calendar years
Ratio | Equal intervals + true zero | Height, weight, age, income (zero = none)
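
The practical difference between interval and ratio data is which arithmetic makes sense. As a rough illustration (the temperatures are arbitrary, and the Kelvin conversion is brought in here only because Kelvin has a true zero):

```python
# Interval scale: Celsius has equal steps but no true zero.
cool_c, warm_c = 10.0, 20.0
naive_ratio = warm_c / cool_c  # 2.0, but "twice as hot" is not a meaningful claim

# Ratio scale: Kelvin starts at a true zero, so ratios are meaningful.
cool_k, warm_k = cool_c + 273.15, warm_c + 273.15
kelvin_ratio = warm_k / cool_k  # roughly 1.035, not 2

# Differences are meaningful on both scales and agree with each other.
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k

print(f"Celsius ratio: {naive_ratio}, Kelvin ratio: {kelvin_ratio:.3f}")
print(f"Differences: {diff_c} vs {diff_k:.1f}")
```

Subtraction works at the interval level, but ratios ("twice as much") only become meaningful at the ratio level.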

3. Data Collection Methods

Three Main Methods

Method | Description | Can Establish Causation?
Observational Study | Observe without interfering | No - only association
Survey | Ask people questions | No - only association
Experiment | Manipulate variables with random assignment | YES - can show causation!
Only EXPERIMENTS with random assignment can establish cause-and-effect relationships!
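
Random assignment means chance alone decides who gets the treatment. A minimal sketch, assuming 20 hypothetical participants and two equal groups (the function name randomly_assign and the participant labels are illustrative):

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into treatment and control groups."""
    rng = random.Random(seed)      # seed only makes the example repeatable
    shuffled = list(participants)  # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

participants = [f"P{i:02d}" for i in range(1, 21)]
treatment, control = randomly_assign(participants, seed=42)

print("Treatment:", treatment)
print("Control:  ", control)
```

Because the split ignores everything about the participants, other factors (age, motivation, health) tend to balance out across the two groups, which is what lets an experiment point to the treatment as the cause.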

Sampling Methods

Method | Description | Quality
Simple Random | Every member has an equal chance of selection | Good
Stratified | Divide into groups (strata), randomly sample from each | Good
Cluster | Randomly select entire groups and include everyone in them | Good
Systematic | Select every kth member after a random starting point | Usually good
Convenience | Sample whoever is easy to reach | BIASED
Voluntary Response | People self-select to participate | VERY BIASED
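
The sampling methods above can be sketched with Python's standard random module. The population of 100 students, the class years, and the group sizes are all hypothetical choices made for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 students spread over 4 class years.
population = [{"id": i, "year": 1 + i % 4} for i in range(100)]

# Simple random: every student has the same chance of selection.
simple_random = rng.sample(population, 10)

# Stratified: split by year, then randomly sample within each year.
stratified = []
for year in (1, 2, 3, 4):
    stratum = [s for s in population if s["year"] == year]
    stratified.extend(rng.sample(stratum, 3))

# Cluster: randomly pick whole groups and keep everyone in them.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]  # 10 groups of 10
cluster_sample = [s for group in rng.sample(clusters, 2) for s in group]

# Systematic: random start, then every k-th student.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

# Convenience (biased): just take whoever is first on the list.
convenience = population[:10]
```

Voluntary response is not shown because it cannot be simulated by the researcher: participants choose themselves, which is exactly why it is the most biased method.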

Key Terms

Population: The entire group you want to study
Sample: A subset of the population that you actually collect data from
Parameter: A numerical summary of the population (usually unknown)
Statistic: A numerical summary of the sample (what we calculate)
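
These four terms fit together as follows. The club, its 500 members, and their ages are simulated purely for illustration; in real studies the parameter is usually unknown and only the statistic can be computed.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Population: ages of all 500 members of a hypothetical club.
population_ages = [rng.randint(18, 65) for _ in range(500)]
parameter = mean(population_ages)  # population mean (normally unknown)

# Sample: the 50 members we actually collect data from.
sample_ages = rng.sample(population_ages, 50)
statistic = mean(sample_ages)  # sample mean, used to estimate the parameter

print(f"Parameter (population mean): {parameter:.1f}")
print(f"Statistic (sample mean):     {statistic:.1f}")
```

The two numbers will generally be close but not identical, and a different sample would give a different statistic while the parameter stays fixed.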

4. Data Visualization

Choosing the Right Graph

Graph Type | Data Type | Purpose
Bar Chart | Categorical | Compare categories (bars have gaps)
Histogram | Continuous numerical | Show distribution (bars touch!)
Pie Chart | Categorical (percentages) | Show parts of whole (adds to 100%)
Line Graph | Time series | Show trends over time
Scatterplot | Two quantitative variables | Show relationship between variables
Bar Chart vs. Histogram:
• Bar charts: Bars have GAPS (categorical data)
• Histograms: Bars TOUCH (continuous data)
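
The difference shows up in how the data are prepared before plotting. The sketch below uses made-up eye colors and heights, and prints a text histogram rather than drawing a real chart; the bin width of 4 inches is an arbitrary choice.

```python
from collections import Counter

# Categorical data -> bar chart: one bar per category, in any order.
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]
bar_heights = Counter(eye_colors)

# Continuous data -> histogram: adjacent bins cover the number line with no gaps.
heights_in = [61.2, 64.8, 66.1, 67.5, 68.0, 69.9, 70.3, 72.4]
bin_width = 4
bin_start = 60
bin_counts = Counter()
for h in heights_in:
    # Left edge of the bin that contains h, e.g. 66.1 falls in [64, 68).
    left_edge = bin_start + bin_width * int((h - bin_start) // bin_width)
    bin_counts[left_edge] += 1

for left_edge in sorted(bin_counts):
    print(f"[{left_edge}, {left_edge + bin_width}): {'#' * bin_counts[left_edge]}")
```

Each histogram bin ends exactly where the next begins, so the bars touch. Eye colors have no such shared boundary, so bar chart bars are drawn with gaps.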

Misleading Graph Techniques (Watch Out!)

  1. Truncated Y-Axis: Starting a bar chart's y-axis above zero exaggerates differences
  2. Cherry-Picking Time Periods: Showing only part of the data to tell a specific story
  3. 3D Effects: Distorting proportions with unnecessary visual effects
  4. Inappropriate Graph Type: Using the wrong graph type makes fair comparisons difficult
  5. Dual Y-Axes Manipulation: Scaling the two axes differently to suggest false correlations
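
The effect of a truncated y-axis can be checked with simple arithmetic. In this sketch the values 50 and 52 and the function name apparent_ratio are hypothetical; the function compares how tall two bars are drawn for a given axis starting point.

```python
def apparent_ratio(value_a, value_b, axis_start=0.0):
    """Ratio of drawn bar heights when the y-axis begins at axis_start."""
    return (value_b - axis_start) / (value_a - axis_start)

# Hypothetical values: product A scores 50, product B scores 52.
honest = apparent_ratio(50, 52)                     # axis starts at 0
truncated = apparent_ratio(50, 52, axis_start=48)   # axis starts at 48

print(f"Axis from 0:  bar B is {honest:.2f}x the height of bar A")
print(f"Axis from 48: bar B is {truncated:.2f}x the height of bar A")
```

A real difference of 4% looks like a doubling once the axis starts at 48, even though the underlying numbers have not changed.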

Best Practices

Quick Reference: Key Formulas & Concepts

Mean (Average)

Mean = (Sum of all values) ÷ (Number of values)
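
As a quick worked example of the formula, here is a small sketch with hypothetical test scores. The function is written out by hand to mirror the formula; Python's built-in statistics.mean would give the same result.

```python
def mean(values):
    """Mean = (sum of all values) / (number of values)."""
    if not values:
        raise ValueError("mean of an empty list is undefined")
    return sum(values) / len(values)

scores = [78, 85, 92, 70, 85]  # hypothetical test scores
average = mean(scores)         # (78 + 85 + 92 + 70 + 85) / 5 = 410 / 5

print(average)
```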

Important Reminders

1. Correlation ≠ Causation
2. Only experiments with random assignment can establish causation
3. Convenience and voluntary response samples are biased
4. Bar chart bars have gaps; histogram bars touch
5. Always question: "How was this data collected?"

Free Statistics Learning Platform • Safaa Dabagh • sdabagh.github.io

© 2025 • Part of UCLA Dissertation Research