Descriptive Statistics Quick Reference

Module 2: Measures of Center, Spread, Shape & Boxplots

KEY PRINCIPLE: Use MEDIAN for skewed data or outliers • Use MEAN for symmetric data

Measures of Center

Mean: x̄ = Σx / n

Sum of all values ÷ number of values

Median: Middle value when ordered

If even n: average of two middle values

Mode: Most frequent value

Use...    When...
Mean      Symmetric, no outliers
Median    Skewed or has outliers
Mode      Categorical data
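
A minimal sketch of the three measures of center, using Python's standard `statistics` module. The sample values and variable names are made up for illustration.

```python
from statistics import mean, median, multimode

# Hypothetical sample: five small values plus one extreme value.
values = [2, 3, 3, 4, 5, 19]

center_mean = mean(values)        # sum of all values / number of values
center_median = median(values)    # even n: average of the two middle values
center_modes = multimode(values)  # list of the most frequent value(s)

# The mode also works for categorical data, where mean and median do not apply.
colors = ["red", "blue", "red", "green"]
color_modes = multimode(colors)

print(center_mean, center_median, center_modes, color_modes)
```

Here the single large value (19) pulls the mean well above the median, which is why the median is the safer choice when outliers are present.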

Measures of Spread

Range: Max − Min
IQR: Q3 − Q1

Spread of middle 50% of data

Variance (sample): s² = Σ(x − x̄)² / (n − 1)
Std Dev: s = √s²

Same units as data; typical distance from mean
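
As a rough illustration, the range, sample variance, and standard deviation can be computed as below. The data are hypothetical; `variance` and `stdev` from the standard `statistics` module use the n − 1 denominator, and the manual line repeats the formula for comparison.

```python
import math
from statistics import stdev, variance

# Hypothetical sample values.
values = [4, 8, 6, 2, 10]
n = len(values)
x_bar = sum(values) / n  # mean

data_range = max(values) - min(values)                        # Max - Min
manual_var = sum((x - x_bar) ** 2 for x in values) / (n - 1)  # formula written out
sample_var = variance(values)                                 # same result via the library
sample_sd = stdev(values)                                     # square root of the variance

print(data_range, manual_var, sample_var, sample_sd)
```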

Resistant to Outliers?    Yes             No
Center                    Median, Mode    Mean
Spread                    IQR             Range, SD
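
One way to see resistance in action is to add a single extreme value to a small data set and compare how far each statistic moves. The numbers below are invented for the demonstration.

```python
from statistics import mean, median

# Hypothetical data: the second list adds one extreme value (200).
clean = [10, 12, 13, 15, 20]
with_outlier = clean + [200]

mean_shift = mean(with_outlier) - mean(clean)        # not resistant: large shift
median_shift = median(with_outlier) - median(clean)  # resistant: small shift
range_shift = (max(with_outlier) - min(with_outlier)) - (max(clean) - min(clean))

print(mean_shift, median_shift, range_shift)
```

In this example the median moves by 1 while the mean moves by 31 and the range by 180.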

Empirical Rule (68-95-99.7)

For BELL-SHAPED distributions only!

~68% within x̄ ± 1s
~95% within x̄ ± 2s
~99.7% within x̄ ± 3s

Example: Mean = 100, SD = 15
• 68% between 85–115
• 95% between 70–130
• 99.7% between 55–145
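
The example above can be reproduced with a small helper. The function name `empirical_intervals` is hypothetical, and the intervals are only meaningful when the data are roughly bell-shaped.

```python
def empirical_intervals(mean, sd):
    """Return the mean ± 1, 2, 3 SD intervals, keyed by approximate coverage."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {label: (mean - k * sd, mean + k * sd) for k, label in coverage.items()}

intervals = empirical_intervals(100, 15)
for label, (low, high) in intervals.items():
    print(f"~{label} between {low} and {high}")
```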

Five-Number Summary

1. Min • 2. Q1 (25th percentile) • 3. Median (Q2, 50th) • 4. Q3 (75th) • 5. Max

To Find Quartiles:
  1. Order data, find median (Q2)
  2. Q1 = median of lower half
  3. Q3 = median of upper half
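
The steps above can be sketched as follows. One assumption is not stated in this sheet: when n is odd, the median itself is left out of both halves. Software packages often use other quartile conventions, so results may differ slightly. The function name is hypothetical, and the sketch expects at least two values.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    ordered = sorted(data)                 # step 1: order the data
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                 # values below the middle
    # Assumption: for odd n, skip the middle value when forming the upper half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

summary_odd = five_number_summary([7, 1, 3, 9, 5, 11, 13])
summary_even = five_number_summary([4, 1, 3, 2])
print(summary_odd, summary_even)
```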

Distribution Shapes

Symmetric

Mean ≈ Median ≈ Mode
Mirror image sides
Ex: Heights, test scores

Right-Skewed (Positive)

Typically: Mode < Median < Mean
Long tail to right
Ex: Income, home prices

Left-Skewed (Negative)

Typically: Mean < Median < Mode
Long tail to left
Ex: Age at death

Mean pulled toward the tail!
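
A small, made-up right-skewed data set shows the mean being pulled toward the tail. The values are hypothetical incomes in thousands.

```python
from statistics import mean, median, multimode

# Hypothetical right-skewed data: most values are low, one is very high.
incomes = [20, 25, 25, 30, 35, 40, 120]

mode_value = multimode(incomes)[0]  # most frequent value
median_value = median(incomes)      # middle value
mean_value = mean(incomes)          # pulled upward by the value 120

print(mode_value, median_value, round(mean_value, 1))
```

The ordering Mode < Median < Mean appears here, with the mean sitting closest to the long right tail.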

Outlier Detection (1.5×IQR Rule)

Lower Fence = Q1 − 1.5×IQR
Upper Fence = Q3 + 1.5×IQR

Any value < Lower Fence OR > Upper Fence = OUTLIER

Example: Q1 = 20, Q3 = 40, IQR = 20
Lower: 20 − 30 = −10
Upper: 40 + 30 = 70
Outliers: < −10 or > 70
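
The fence calculation can be written as a short sketch. The function names and the test data are hypothetical; the quartiles match the example above.

```python
def iqr_fences(q1, q3):
    """Return (lower fence, upper fence) from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data, q1, q3):
    """Return the values strictly below the lower fence or above the upper fence."""
    lower, upper = iqr_fences(q1, q3)
    return [x for x in data if x < lower or x > upper]

lower_fence, upper_fence = iqr_fences(20, 40)
flagged = find_outliers([-15, 5, 25, 38, 70, 82], q1=20, q3=40)
print(lower_fence, upper_fence, flagged)
```

A value exactly on a fence (70 here) is not flagged, because the rule uses strict inequalities.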

Boxplot Components

  • Box: Q1 to Q3 (middle 50%)
  • Line in box: Median (Q2)
  • Whiskers: Extend to the smallest and largest values that lie within the 1.5×IQR fences
  • Dots: Outliers beyond whiskers

Reading Boxplot Shape:

Shape           Boxplot Appearance
Symmetric       Median centered, equal whiskers
Right-skewed    Median left, longer right whisker
Left-skewed     Median right, longer left whisker
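
To make the components concrete, here is a sketch that works out what a boxplot would draw from a data set and its quartiles. The function name and data are hypothetical, and the quartiles are passed in rather than computed.

```python
def boxplot_parts(data, q1, median, q3):
    """Return the box, median line, whisker ends, and outlier dots."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data values that are still inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = sorted(x for x in data if x < lower_fence or x > upper_fence)
    return {
        "box": (q1, q3),
        "median": median,
        "whiskers": (min(inside), max(inside)),
        "outliers": outliers,
    }

# Hypothetical data with Q1 = 20, median = 30, Q3 = 40.
parts = boxplot_parts([12, 20, 24, 30, 36, 40, 95], q1=20, median=30, q3=40)
print(parts)
```

In this example the upper whisker stops at 40 and the value 95 is drawn as a separate dot.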

Comparing Groups

Side-by-side boxplots compare:

  • Center: Which median is higher?
  • Spread: Which has wider box/whiskers?
  • Shape: Symmetric vs. skewed?
  • Outliers: Which has unusual values?
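
The same comparison can be made numerically. The sketch below summarizes two invented groups by median, IQR, and outliers; it assumes the median-of-halves quartile method with the middle value excluded for odd n, and the function name is hypothetical.

```python
from statistics import median

def summarize(data):
    """Return the median, IQR, and 1.5 x IQR outliers for one group."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    # Assumption: for odd n, the middle value is excluded from the upper half.
    q3 = median(ordered[half + 1:] if n % 2 else ordered[half:])
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "median": median(ordered),
        "iqr": iqr,
        "outliers": [x for x in ordered if x < low or x > high],
    }

# Hypothetical scores for two groups.
group_a = [60, 65, 70, 75, 80, 85, 90]
group_b = [50, 70, 72, 74, 76, 78, 100]

summary_a = summarize(group_a)
summary_b = summarize(group_b)
print(summary_a)
print(summary_b)
```

Group A has the slightly higher median and the wider box; group B is more tightly packed but has two unusual values.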

Decision Tree: Choosing Statistics

Use MEAN & SD when:

  • Distribution is symmetric
  • No extreme outliers
  • Data is bell-shaped

Use MEDIAN & IQR when:

  • Distribution is skewed
  • Outliers present
  • Data like income/prices
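
The decision tree reduces to a single check, sketched below. The function name and its two yes/no inputs are hypothetical; judging symmetry and outliers still requires looking at a histogram or boxplot first.

```python
def choose_summary(is_symmetric, has_outliers):
    """Return the (center, spread) pair suggested by the decision tree."""
    if is_symmetric and not has_outliers:
        return ("mean", "standard deviation")
    # Skewed data or data with outliers: use the resistant measures.
    return ("median", "IQR")

print(choose_summary(is_symmetric=True, has_outliers=False))
print(choose_summary(is_symmetric=False, has_outliers=True))
```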

Key Reminders

Free Statistics Learning Platform • Safaa Dabagh • sdabagh.github.io • © 2025