Practice Problems

1. Sleep and Test Performance

A researcher wants to test if students who get 8+ hours of sleep perform better on exams than students who get less than 8 hours. She randomly selects students and records:

8+ hours group: n₁ = 35, x̄₁ = 82, s₁ = 9
< 8 hours group: n₂ = 40, x̄₂ = 76, s₂ = 11

Test at α = 0.05. Does sleep improve test performance?

Solution:

Step 1: Hypotheses

H₀: μ₁ = μ₂ (no difference)
Hₐ: μ₁ > μ₂ (8+ hours group scores higher) — Right-tailed test

Step 2: Check conditions

Independent random samples
Both n₁ ≥ 30 and n₂ ≥ 30 (CLT applies)

Step 3: Test statistic (unpooled)

t = (82 - 76) / √(9²/35 + 11²/40) = 6 / √(2.314 + 3.025) = 6 / 2.310 ≈ 2.60

Step 4: Degrees of freedom

Using Welch's approximation: df ≈ 72 (use technology)

Step 5: p-value

For t = 2.60, df = 72, right-tailed: p-value ≈ 0.006

Step 6: Decision

Since p-value (0.006) < α (0.05), reject H₀.

Conclusion: There is sufficient evidence to conclude that students who get 8+ hours of sleep score significantly higher on exams.
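Steps 3 and 4 can be checked with a few lines of Python. This is a minimal sketch working from the summary statistics only; the function name welch_t is illustrative and not part of any library.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for an unpooled two-sample t-test from summary statistics."""
    v1 = sd1**2 / n1  # estimated variance of the first sample mean
    v2 = sd2**2 / n2  # estimated variance of the second sample mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

t, df = welch_t(82, 9, 35, 76, 11, 40)
print(f"t = {t:.2f}, df = {df:.1f}")  # t = 2.60, df = 72.7
```

Rounding the degrees of freedom down to 72 is the conservative choice when reading a t-table.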

2. Urban vs Rural Income

Is there a difference in average household income between urban and rural areas?

Urban: n₁ = 50, x̄₁ = $68,000, s₁ = $15,000
Rural: n₂ = 45, x̄₂ = $62,000, s₂ = $12,000

Test at α = 0.01 (two-tailed).

Solution:

Hypotheses: H₀: μ₁ = μ₂, Hₐ: μ₁ ≠ μ₂ (two-tailed)

Test statistic:

t = (68000 - 62000) / √(15000²/50 + 12000²/45) = 6000 / √(4500000 + 3200000) = 6000 / 2774.89 ≈ 2.16

Critical value: For α = 0.01 (two-tailed), df ≈ 91 (Welch's approximation): t* ≈ ±2.63

Decision: Since |2.16| < 2.63, fail to reject H₀.

Conclusion: At the 0.01 significance level, there is insufficient evidence to conclude that average household incomes differ between urban and rural areas. The $6,000 difference could be due to sampling variability.

3. Teaching Methods Comparison

Two teaching methods are compared. Method A is used with 25 students, Method B with 28 students. Assume equal variances.

Method A: n₁ = 25, x̄₁ = 88, s₁ = 7
Method B: n₂ = 28, x̄₂ = 84, s₂ = 8

Use pooled variance approach. Test at α = 0.05 if Method A is better.

Solution:

Hypotheses: H₀: μ₁ = μ₂, Hₐ: μ₁ > μ₂ (right-tailed)

Pooled variance:

sp² = [(25-1)(7²) + (28-1)(8²)] / (25+28-2) = [1176 + 1728] / 51 = 56.94

sp = 7.55

Test statistic:

t = (88 - 84) / (7.55√(1/25 + 1/28)) = 4 / (7.55 × 0.275) = 4 / 2.076 ≈ 1.93

Degrees of freedom: df = 25 + 28 - 2 = 51

Critical value: For α = 0.05 (right-tailed), df = 51: t* ≈ 1.675

Decision: Since 1.93 > 1.675, reject H₀.

Conclusion: There is sufficient evidence that Method A produces significantly higher scores than Method B.
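The pooled calculation can be reproduced as follows. This is a sketch under the problem's equal-variance assumption; pooled_t is a hypothetical helper name.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df, pooled variance) for an equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, sp2

t, df, sp2 = pooled_t(88, 7, 25, 84, 8, 28)
print(f"sp^2 = {sp2:.2f}, t = {t:.2f}, df = {df}")  # sp^2 = 56.94, t = 1.93, df = 51
```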

4. Drug Side Effects

Two drugs are compared for a side effect (headache duration in hours).

Drug X: n₁ = 30, x̄₁ = 4.2 hours, s₁ = 1.5
Drug Y: n₂ = 35, x̄₂ = 3.8 hours, s₂ = 1.8

Is there a significant difference at α = 0.10?

Solution:

Hypotheses: H₀: μ₁ = μ₂, Hₐ: μ₁ ≠ μ₂ (two-tailed)

Test statistic:

t = (4.2 - 3.8) / √(1.5²/30 + 1.8²/35) = 0.4 / √(0.075 + 0.0926) = 0.4 / 0.410 ≈ 0.98

p-value: For t = 0.98, df ≈ 62, two-tailed: p-value ≈ 0.33

Decision: Since p-value (0.33) > α (0.10), fail to reject H₀.

Conclusion: There is no significant difference in headache duration between the two drugs.

5. Confidence Interval for Difference

Using the data from Problem 1 (8+ hours: n₁=35, x̄₁=82, s₁=9; <8 hours: n₂=40, x̄₂=76, s₂=11), construct a 95% confidence interval for μ₁ - μ₂.

Solution:

Formula: (x̄₁ - x̄₂) ± t* × √(s₁²/n₁ + s₂²/n₂)

Standard error:

SE = √(81/35 + 121/40) = √(2.314 + 3.025) = 2.310

Critical value: For 95% CI, df ≈ 72: t* ≈ 1.993

Confidence interval:

(82 - 76) ± 1.993 × 2.310 = 6 ± 4.604 = (1.396, 10.604)

Interpretation: We are 95% confident that students who get 8+ hours of sleep score between 1.4 and 10.6 points higher on exams than students who get less sleep. Since the interval doesn't contain 0, there is a significant difference.
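The interval can be verified with a short sketch. The critical value t* = 1.993 is taken from the solution above and passed in as an argument, since the Python standard library has no t-distribution quantile function; welch_ci is an illustrative name.

```python
import math

def welch_ci(mean1, sd1, n1, mean2, sd2, n2, t_star):
    """Confidence interval for mu1 - mu2 using the unpooled standard error."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    diff = mean1 - mean2
    margin = t_star * se
    return diff - margin, diff + margin

low, high = welch_ci(82, 9, 35, 76, 11, 40, t_star=1.993)
print(f"({low:.1f}, {high:.1f})")  # (1.4, 10.6)
```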

6. Weight Loss Program

Six people participate in a weight loss program. Their weights (in pounds) are recorded before and after:

Person	Before	After	d = Before - After
1	180	175	5
2	195	188	7
3	210	205	5
4	165	162	3
5	188	180	8
6	202	196	6

Test at α = 0.05 if the program results in significant weight loss.

Solution:

Step 1: Calculate d̄ and sd

d̄ = (5+7+5+3+8+6) / 6 = 34/6 = 5.667 pounds

Squared deviations: (5-5.667)²=0.444, (7-5.667)²=1.778, (5-5.667)²=0.444, (3-5.667)²=7.111, (8-5.667)²=5.444, (6-5.667)²=0.111; sum = 15.333

sd = √[Σ(d-d̄)²/(n-1)] = √[15.333/5] = 1.751

Step 2: Hypotheses

H₀: μd = 0, Hₐ: μd > 0 (right-tailed, expect weight loss)

Step 3: Test statistic

t = (5.667 - 0) / (1.751/√6) = 5.667 / 0.715 ≈ 7.93

Step 4: Critical value

df = 6 - 1 = 5, α = 0.05 (right-tailed): t* ≈ 2.015

Step 5: Decision

Since 7.93 > 2.015, reject H₀. p-value < 0.001

Conclusion: The weight loss program results in significant weight loss (average 5.67 pounds).

7. Blood Pressure Medication

A blood pressure medication is tested on 10 patients. Systolic BP is measured before and after treatment:

Mean difference (Before - After): d̄ = 8.5 mmHg
Standard deviation of differences: sd = 6.2 mmHg
n = 10 patients

Test at α = 0.01 if the medication reduces blood pressure.

Solution:

Hypotheses: H₀: μd = 0, Hₐ: μd > 0 (right-tailed)

Test statistic:

t = (8.5 - 0) / (6.2/√10) = 8.5 / 1.960 ≈ 4.34

Degrees of freedom: df = 10 - 1 = 9

Critical value: For α = 0.01 (right-tailed), df = 9: t* ≈ 2.821

Decision: Since 4.34 > 2.821, reject H₀.

Conclusion: At the 0.01 significance level, there is strong evidence that the medication reduces systolic blood pressure. The observed mean reduction was 8.5 mmHg.

8. Tutoring Effectiveness

Twelve students take a pre-test before tutoring and a post-test after:

Mean difference (Post - Pre): d̄ = 12.5 points
Standard deviation: sd = 8.4 points

Construct a 95% confidence interval for the mean improvement.

Solution:

Formula: d̄ ± t* × (sd/√n)

Critical value: df = 12 - 1 = 11, 95% CI: t* ≈ 2.201

Margin of error:

ME = 2.201 × (8.4/√12) = 2.201 × 2.425 = 5.337

Confidence interval:

12.5 ± 5.337 = (7.16, 17.84) points

Interpretation: We are 95% confident that tutoring improves test scores by an average of 7.16 to 17.84 points. Since the interval doesn't contain 0, tutoring significantly improves scores.

9. Reaction Time Study

Eight subjects' reaction times (in milliseconds) are measured on their dominant and non-dominant hands:

Subject	Dominant	Non-Dominant
1	285	310
2	290	305
3	275	295
4	300	320
5	280	300
6	295	315
7	270	285
8	288	308

Test if there's a significant difference at α = 0.05.

Solution:

Calculate differences (Dominant - Non-Dominant):

d: -25, -15, -20, -20, -20, -20, -15, -20

Statistics:

d̄ = -155/8 = -19.375 ms

sd ≈ 3.204 ms

Hypotheses: H₀: μd = 0, Hₐ: μd ≠ 0 (two-tailed)

Test statistic:

t = -19.375 / (3.204/√8) = -19.375 / 1.133 ≈ -17.10

Decision: df = 7, |t| = 17.10 >> critical value. Reject H₀ (p < 0.001).

Conclusion: There is overwhelming evidence that reaction time is significantly faster on the dominant hand (by about 19 ms on average).
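The differences and summary statistics in this solution are stated without intermediate work, so a quick check from the raw table may help. The sketch below uses only the standard library; the variable names are illustrative.

```python
import math
import statistics

dominant = [285, 290, 275, 300, 280, 295, 270, 288]
non_dominant = [310, 305, 295, 320, 300, 315, 285, 308]

# Paired design: reduce each subject to one difference, then run a one-sample t-test
diffs = [a - b for a, b in zip(dominant, non_dominant)]
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
t = d_bar / (s_d / math.sqrt(len(diffs)))

print(diffs)  # [-25, -15, -20, -20, -20, -20, -15, -20]
print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t:.2f}")  # d_bar = -19.375, s_d = 3.204, t = -17.10
```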

10. Identify the Design

For each scenario, state whether you should use a paired test or independent test:

a) Comparing anxiety levels before and after therapy for 20 patients
b) Comparing average salaries of teachers vs. nurses (different people)
c) Testing if identical twins differ in IQ (one twin raised in each environment)
d) Comparing recovery times for patients receiving Drug A vs. Drug B (random assignment)

Solution:

(a) Paired test - Same 20 patients measured twice (before/after)

(b) Independent test - Two separate groups (teachers vs. nurses)

(c) Paired test - Matched pairs design (twins matched)

(d) Independent test - Two separate groups of patients

11. Drug Cure Rates

Two drugs are compared:

Drug A: 85 out of 150 patients cured
Drug B: 92 out of 180 patients cured

Test at α = 0.05 if the cure rates differ.

Solution:

Sample proportions:

p̂₁ = 85/150 = 0.567, p̂₂ = 92/180 = 0.511

Check conditions:

n₁p̂₁ = 85 ≥ 10, n₁(1-p̂₁) = 65 ≥ 10

n₂p̂₂ = 92 ≥ 10, n₂(1-p̂₂) = 88 ≥ 10

Hypotheses: H₀: p₁ = p₂, Hₐ: p₁ ≠ p₂ (two-tailed)

Pooled proportion:

p̄ = (85+92)/(150+180) = 177/330 = 0.536

Test statistic:

z = (0.567-0.511) / √[0.536×0.464×(1/150+1/180)]

z = 0.0556 / √[0.2487×0.01222] = 0.0556 / 0.0551 ≈ 1.01 (using the unrounded difference 85/150 - 92/180 = 0.0556)

Decision: For α = 0.05 (two-tailed), z* = ±1.96. Since |1.01| < 1.96, fail to reject H₀.

Conclusion: There is insufficient evidence to conclude the cure rates differ between the two drugs.

12. Gender and Policy Support

A survey asks about support for a new policy:

Men: 240 out of 400 support (60%)
Women: 300 out of 450 support (66.7%)

Is there a significant difference at α = 0.01?

Solution:

Hypotheses: H₀: p₁ = p₂, Hₐ: p₁ ≠ p₂

Pooled proportion:

p̄ = (240+300)/(400+450) = 540/850 = 0.635

Test statistic:

z = (0.60-0.667) / √[0.635×0.365×(1/400+1/450)]

z = -0.067 / 0.0331 ≈ -2.02

Critical value: α = 0.01 (two-tailed): z* = ±2.576

Decision: Since |-2.02| < 2.576, fail to reject H₀.

Conclusion: At the 0.01 level, there is insufficient evidence to conclude men and women differ in their support for the policy.
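The pooled two-proportion test can be sketched with the standard library's NormalDist, which also gives the two-sided p-value that the solution does not report. The function name two_proportion_z is illustrative.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = two_proportion_z(240, 400, 300, 450)
print(f"z = {z:.2f}, p = {p:.3f}")  # z = -2.02, p = 0.044
```

A p-value of about 0.044 is above α = 0.01, which matches the decision above. The same result would be significant at α = 0.05, which shows how much the conclusion depends on the chosen level.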

13. Online vs In-Person Pass Rates

Compare pass rates for online vs. in-person classes:

Online: 78 out of 120 students pass
In-person: 95 out of 130 students pass

Construct a 95% confidence interval for the difference in pass rates.

Solution:

Sample proportions:

p̂₁ = 78/120 = 0.65, p̂₂ = 95/130 = 0.731

Formula (NO pooling for CI):

(p̂₁ - p̂₂) ± z* × √[(p̂₁(1-p̂₁)/n₁) + (p̂₂(1-p̂₂)/n₂)]

Standard error:

SE = √[(0.65×0.35/120) + (0.731×0.269/130)]

SE = √[0.001896 + 0.001512] = 0.0584

95% CI: z* = 1.96

(0.65 - 0.731) ± 1.96 × 0.0584

-0.081 ± 0.114 = (-0.195, 0.033)

Interpretation: We're 95% confident the difference in pass rates (online minus in-person) is between -19.5 and +3.3 percentage points. Since the interval contains 0, there's no significant difference at the 5% level.
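The unpooled interval can be reproduced as a sketch like this, where z* comes from NormalDist().inv_cdf rather than a table. The function name two_proportion_ci is illustrative.

```python
import math
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

low, high = two_proportion_ci(78, 120, 95, 130)
print(f"({low:.3f}, {high:.3f})")
```

Because the code carries full precision through every step, the upper limit comes out near 0.034 instead of 0.033; the hand calculation differs in the last digit only because it rounds intermediate values.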

14. Quality Control Comparison

Two factories' defect rates are compared:

Factory 1: 18 defective out of 250 items (7.2%)
Factory 2: 25 defective out of 300 items (8.3%)

Test at α = 0.10 if Factory 2 has a higher defect rate (one-tailed).

Solution:

Hypotheses: H₀: p₂ = p₁, Hₐ: p₂ > p₁ (right-tailed)

Pooled proportion:

p̄ = (18+25)/(250+300) = 43/550 = 0.0782

Test statistic:

z = (0.0833-0.072) / √[0.0782×0.9218×(1/250+1/300)]

z = 0.0113 / 0.0230 ≈ 0.49

Critical value: α = 0.10 (right-tailed): z* = 1.28

Decision: Since 0.49 < 1.28, fail to reject H₀.

Conclusion: There is insufficient evidence that Factory 2 has a higher defect rate.

15. Checking Conditions

Can you conduct a two-proportion z-test for these scenarios?

a) n₁ = 50 with 8 successes, n₂ = 60 with 45 successes
b) n₁ = 150 with 120 successes, n₂ = 200 with 30 successes
c) n₁ = 100 with 55 successes, n₂ = 90 with 40 successes

Solution:

(a) NO - n₁p̂₁ = 8 < 10. Fails success-failure condition.

(b) YES - n₁p̂₁ = 120 ≥ 10, n₁(1-p̂₁) = 30 ≥ 10, n₂p̂₂ = 30 ≥ 10, n₂(1-p̂₂) = 170 ≥ 10. All conditions met.

(c) YES - n₁p̂₁ = 55 ≥ 10, n₁(1-p̂₁) = 45 ≥ 10, n₂p̂₂ = 40 ≥ 10, n₂(1-p̂₂) = 50 ≥ 10. All conditions met.
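The success-failure check is mechanical enough to automate. The sketch below assumes the threshold of 10 used in these solutions; success_failure_ok is a hypothetical helper name.

```python
def success_failure_ok(successes, n, threshold=10):
    """True when both the success and failure counts reach the threshold."""
    return successes >= threshold and (n - successes) >= threshold

# (successes, sample size) for each group in scenarios a, b, c
scenarios = {
    "a": [(8, 50), (45, 60)],
    "b": [(120, 150), (30, 200)],
    "c": [(55, 100), (40, 90)],
}

results = {}
for label, groups in scenarios.items():
    # Every group in a scenario must pass for the z-test to be appropriate
    results[label] = all(success_failure_ok(x, n) for x, n in groups)
    print(label, "YES" if results[label] else "NO")
# a NO
# b YES
# c YES
```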

16. Test Selection Practice

For each scenario, identify which hypothesis test to use:

a) A researcher compares average commute times in City A (n=50) vs. City B (n=60).
b) A company claims 90% customer satisfaction. You survey 200 customers to test this.
c) Nurses measure patients' pain levels before and after a treatment (same 30 patients).
d) Compare proportion of voters supporting Candidate X in Texas vs. California.

Solution:

(a) Independent two-sample t-test - Two separate cities (independent), testing means (commute times)

(b) One-sample z-test for proportion - One sample, testing proportion against claimed 90%

(c) Paired t-test - Same 30 patients measured twice (before/after), testing means (pain levels)

(d) Two-sample z-test for proportions - Two states (independent), testing proportions (voter support)

17. Independent or Paired?

Determine if each scenario requires independent or paired test:

a) Compare average test scores of 40 students using Method A vs. 35 different students using Method B
b) Measure blood sugar levels in 25 diabetic patients before and after a diet change
c) Compare average heights of 50 adult men vs. 50 adult women
d) Test reading comprehension in 20 children at age 5 and again at age 7

Solution:

(a) Independent - Different students in each group

(b) Paired - Same 25 patients measured twice

(c) Independent - Different people (men vs. women)

(d) Paired - Same 20 children measured twice (at different ages)

18. Complete Analysis

A fitness instructor wants to test if a new workout program improves mile run times. She records times for 15 participants before and after the 8-week program:

Mean difference (Before - After): d̄ = 1.2 minutes
Standard deviation: sd = 0.8 minutes

a) Which test should be used?
b) Test at α = 0.05 if the program improves times.
c) Construct a 90% confidence interval.

Solution:

(a) Paired t-test - Same 15 participants measured twice

(b) Hypothesis test:

H₀: μd = 0, Hₐ: μd > 0 (improvement means positive difference)

t = 1.2 / (0.8/√15) = 1.2 / 0.2066 = 5.81

df = 14, critical value ≈ 1.761. Since 5.81 > 1.761, reject H₀.

Conclusion: The program significantly improves run times (p < 0.001).

(c) 90% CI: t* ≈ 1.761 for df = 14

1.2 ± 1.761 × 0.2066 = 1.2 ± 0.364 = (0.836, 1.564) minutes

We're 90% confident the program improves times by 0.84 to 1.56 minutes on average.
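Parts (b) and (c) share the same standard error and, here, the same critical value: a right-tailed test at α = 0.05 and a two-sided 90% interval both use the upper 5% point of the t-distribution. A minimal sketch, with t* = 1.761 taken from the solution and paired_summary as an illustrative name:

```python
import math

def paired_summary(d_bar, s_d, n, t_crit):
    """Return the paired t statistic and a CI for the mean difference."""
    se = s_d / math.sqrt(n)
    t = d_bar / se
    margin = t_crit * se
    return t, (d_bar - margin, d_bar + margin)

t, (low, high) = paired_summary(1.2, 0.8, 15, t_crit=1.761)
print(f"t = {t:.2f}, reject H0: {t > 1.761}")  # t = 5.81, reject H0: True
print(f"90% CI: ({low:.2f}, {high:.2f})")      # 90% CI: (0.84, 1.56)
```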

19. Mixed Practice

Identify the test AND the hypotheses for each:

a) Test if more than 70% of college students have part-time jobs (sample: 250 students)
b) Compare average GPAs of athletes vs. non-athletes at a university
c) Test if a meditation app reduces stress scores (measure same 40 people before/after)

Solution:

(a) One-sample z-test for proportion

H₀: p = 0.70, Hₐ: p > 0.70 (right-tailed)

(b) Independent two-sample t-test

H₀: μ₁ = μ₂, Hₐ: μ₁ ≠ μ₂ (two-tailed, unless direction specified)

(c) Paired t-test

H₀: μd = 0, Hₐ: μd > 0 (right-tailed, if d = Before - After, expecting reduction)

20. Critical Thinking Challenge

A researcher wants to test if a new teaching method improves test scores. She has two options:

Design A: Randomly assign 50 students to new method, 50 to traditional method. Compare final exam scores.

Design B: Give all 50 students a pre-test, teach using new method, then give post-test. Compare before/after scores.

a) Which test would be used for each design?
b) Which design is more powerful (better at detecting real effects)?
c) What are the tradeoffs?

Solution:

(a) Tests:

Design A: Independent two-sample t-test (two separate groups)
Design B: Paired t-test (same students, before/after)

(b) More Powerful: Design B (paired)

Paired designs control for individual variability. Each student serves as their own control, eliminating noise from differing baseline abilities.
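One way to see why pairing helps is to compare standard errors. The sketch below rests on assumptions not given in the problem: both sets of scores have the same standard deviation sigma, and each student's two scores have correlation rho. Under those assumptions the variance of a within-student difference is 2·sigma²·(1 - rho), so the standard error shrinks as rho grows. The values sigma = 10 and the rho grid are hypothetical.

```python
import math

def se_independent(sigma, n_per_group):
    """SE of the difference in means for two independent groups of equal size."""
    return sigma * math.sqrt(2 / n_per_group)

def se_paired(sigma, n_pairs, rho):
    """SE of the mean difference when each pair's scores have correlation rho."""
    return sigma * math.sqrt(2 * (1 - rho) / n_pairs)

sigma, n = 10, 50  # hypothetical score SD; 50 per group (Design A) or 50 pairs (Design B)
for rho in (0.0, 0.5, 0.8):
    print(rho, round(se_independent(sigma, n), 2), round(se_paired(sigma, n, rho), 2))
```

With rho = 0 the two designs have the same standard error; the paired design gains precision only to the extent that a student's two scores are positively correlated.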

(c) Tradeoffs:

Design A Advantages:

Can directly compare two methods simultaneously
No practice effects from taking test twice

Design A Disadvantages:

Needs more subjects for same power
Individual differences add noise

Design B Advantages:

More powerful (controls individual variability)
Needs fewer subjects

Design B Disadvantages:

No comparison group (can't isolate method effect from practice effect)
Students might improve just from test practice
Can't tell if new method is better than traditional

Best approach: Use Design A if you want to compare two methods. Use Design B only if you also have a control group taking the same pre/post tests with traditional method!
