Learn Without Walls
Lab 10 of 10 — Capstone

Full R Analysis

Everything together — a complete real analysis


🎓 Capstone Project

This is not a fill-in-the-blanks exercise. This is a real mini analysis project using everything you have learned across all 10 labs.

Scenario: You are a university research analyst examining student outcomes data. The registrar has given you a dataset of 100 students with GPA, study habits, demographics, and enrollment details. Your job is to answer six research questions, the last of which is a written executive summary.

There are no pre-written answers here — write your own R code from scratch. A collapsed sample solution is provided at the bottom for reference after you try.

📊 Dataset

Step 1 — Generate the Dataset

Run this code first to create the students data frame. It will be available for all subsequent questions in the same session.
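
The lab's own generator code is not reproduced in this text. As a rough stand-in for practicing offline, the sketch below builds a data frame with the column names the questions mention (major, year, first_gen, study_hours_week, sleep_hours, gpa). The scholarship column name, the major labels, every distribution, and the seed are assumptions made for illustration, so results from it will not match the lab's real dataset.

```r
# Hypothetical stand-in for the lab's generator. Column names follow the
# research questions; `scholarship`, the major labels, all distributions,
# and the seed are assumptions, not the lab's actual values.
set.seed(42)
n <- 100

students <- data.frame(
  major            = sample(c("Biology", "Business", "Engineering", "Psychology"),
                            n, replace = TRUE),
  year             = sample(1:4, n, replace = TRUE),
  first_gen        = sample(c(TRUE, FALSE), n, replace = TRUE),
  scholarship      = sample(c(TRUE, FALSE), n, replace = TRUE),
  study_hours_week = round(runif(n, min = 2, max = 30), 1),
  sleep_hours      = round(rnorm(n, mean = 7, sd = 1), 1)
)

# Simulated GPA: rises with study hours plus noise, clamped to the 0-4 scale.
raw_gpa <- 2.0 + 0.04 * students$study_hours_week + rnorm(n, sd = 0.4)
students$gpa <- round(pmin(pmax(raw_gpa, 0), 4), 2)

str(students)
```

Fixing the seed keeps the simulated data identical between runs, which makes it easier to compare answers across sessions.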


📋 Six Research Questions

Answer all six questions below using your own R code. Each question has its own editor.

  1. Population Description: How many students per major? What is the year distribution? What percentage are first-generation students? What percentage hold scholarships?
  2. GPA Distribution: What are the overall summary statistics for GPA? Break down mean GPA by major using tapply(). Which major has the highest average GPA?
  3. Study Hours & GPA Relationship: What is the Pearson correlation between study_hours_week and gpa? Fit a simple linear regression and interpret the slope.
  4. First-Generation Student Equity: Do first-generation students have significantly different GPAs from non-first-gen students? Run a t-test and state your conclusion.
  5. Multiple Regression: Predict GPA from study_hours_week + sleep_hours + first_gen + year. Which predictors are statistically significant?
  6. Executive Summary: Write an 8-sentence plain-English executive summary of your findings using cat(). Address: sample size, average GPA, top major, study-GPA correlation, first-gen equity finding, multiple regression result, and at least one recommendation.
Question 1

Population Description
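
If the counting and percentage idioms are unfamiliar, here is a small illustration on made-up vectors (not the lab's data). It assumes nothing beyond base R; the vector names are hypothetical.

```r
# Toy vectors, made up for illustration only.
major     <- c("Math", "Art", "Math", "History", "Art", "Math")
first_gen <- c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)

# Counts per category, largest first.
major_counts <- sort(table(major), decreasing = TRUE)
print(major_counts)

# The mean of a logical vector is the share of TRUE values.
pct_first_gen <- 100 * mean(first_gen)
cat(sprintf("First-generation: %.1f%%\n", pct_first_gen))
```

The same two idioms apply to any categorical or TRUE/FALSE column of a data frame.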

Question 2

GPA Distribution
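
As a reminder of how summary() and tapply() behave, here is a sketch on made-up scores and group labels (hypothetical names, not the lab's columns).

```r
# Toy data, made up for illustration only.
score <- c(3.0, 3.4, 2.8, 3.6, 3.2, 3.9)
group <- c("A", "B", "A", "B", "C", "C")

# Minimum, quartiles, median, mean, and maximum in one call.
print(summary(score))

# tapply(values, groups, FUN) applies FUN within each group.
group_means <- tapply(score, group, mean)
print(round(group_means, 2))

# which.max() finds the position of the largest mean; names() gives its label.
top_group <- names(which.max(group_means))
cat("Highest mean:", top_group, "\n")
```

Note that which.max() returns only the first maximum, so ties need a separate check.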

Question 3

Study Hours & GPA Relationship
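
The correlation and regression calls can be tried first on a tiny made-up example. The vectors x and y below are hypothetical and chosen so that y is roughly 1 + 0.5 * x.

```r
# Toy data, made up for illustration only.
x <- c(2, 4, 6, 8, 10)
y <- c(2.1, 2.9, 4.1, 4.9, 6.1)

r <- cor(x, y)                        # Pearson is the default method
fit <- lm(y ~ x)                      # simple linear regression of y on x
slope <- coef(fit)[["x"]]             # estimated change in y per unit of x
r_squared <- summary(fit)$r.squared   # share of variance explained

cat(sprintf("r = %.3f, slope = %.3f, R-squared = %.3f\n", r, slope, r_squared))
```

To interpret a slope, read it as the average change in the outcome associated with a one-unit increase in the predictor. In simple regression, R-squared equals the square of Pearson's r.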

Question 4

First-Generation Student Equity
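
A two-sample t-test can be rehearsed on made-up numbers before applying it to the real groups. The vectors and the 0.05 threshold below are illustrative assumptions.

```r
# Toy data, made up for illustration only: scores for two groups.
group_a <- c(3.1, 2.9, 3.4, 3.0, 3.2, 2.8)
group_b <- c(3.3, 3.6, 3.5, 3.2, 3.7, 3.4)

# With two vectors, t.test() runs Welch's two-sample test by default
# (var.equal = FALSE), which does not assume equal variances.
result <- t.test(group_a, group_b)

alpha <- 0.05
cat(sprintf("Mean A = %.2f, Mean B = %.2f, p = %.4f\n",
            mean(group_a), mean(group_b), result$p.value))

if (result$p.value < alpha) {
  cat("The difference is statistically significant at the chosen level.\n")
} else {
  cat("No statistically significant difference at the chosen level.\n")
}
```

A non-significant result means the data do not show a difference; it does not prove the groups are equal.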

Question 5

Multiple Regression
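
Pulling p-values out of a fitted model is easier to see on simulated data where the true answer is known. In this sketch (all names hypothetical), only x1 actually affects y.

```r
# Toy data, simulated with a fixed seed so results are repeatable.
set.seed(1)
n    <- 60
x1   <- runif(n, 0, 10)
x2   <- runif(n, 0, 10)
flag <- rep(c(TRUE, FALSE), length.out = n)
y    <- 1 + 0.3 * x1 + rnorm(n, sd = 0.5)   # only x1 truly affects y

fit <- lm(y ~ x1 + x2 + flag)

# Coefficient table: one row per term; "Pr(>|t|)" holds the p-values.
# A logical predictor appears as a row named like "flagTRUE".
coefs <- summary(fit)$coefficients
p_values <- coefs[, "Pr(>|t|)"]

# Keep predictors (not the intercept) with p < 0.05.
is_predictor <- names(p_values) != "(Intercept)"
significant <- names(p_values)[p_values < 0.05 & is_predictor]

print(round(coefs, 4))
cat("Significant predictors:", paste(significant, collapse = ", "), "\n")
```

With several predictors tested at the 0.05 level, an unrelated variable can occasionally appear significant by chance, so read the table alongside the effect sizes.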

Question 6 — Executive Summary

Write Your Executive Summary

Write an 8-sentence plain-English executive summary using cat(). Address: sample size, average GPA, top major, study-GPA correlation, first-gen equity finding, multiple regression result, and at least one recommendation.

💡 Hint: Compute key statistics at the top of your script and save them as named variables. Then reference those variables inside cat(sprintf(...)) calls. This way your summary automatically updates if the data changes.
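
The pattern in the hint might look like the following sketch on made-up vectors. The variable names and the 0.5 cutoff for calling a correlation "strong" are illustrative assumptions, not a rule from the lab.

```r
# Toy data, made up for illustration only.
scores <- c(3.1, 2.7, 3.8, 3.4, 2.9)
hours  <- c(12, 8, 20, 15, 10)

# 1. Compute key statistics once and store them in named variables.
n_obs      <- length(scores)
mean_score <- mean(scores)
r_value    <- cor(hours, scores)

# 2. Reference those variables inside cat(sprintf(...)).
cat(sprintf("The sample contains %d observations.\n", n_obs))
cat(sprintf("The average score is %.2f.\n", mean_score))

# 3. Let the data choose the wording (0.5 is an arbitrary cutoff).
strength <- if (abs(r_value) >= 0.5) "strong" else "weak"
cat(sprintf("Hours and score show a %s correlation (r = %.2f).\n",
            strength, r_value))
```

Because every number in the text comes from a variable, rerunning the script on new data rewrites the summary with no manual edits.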

📖 Sample Solution

Try all six questions yourself before opening this. There is no single “right” answer — good R code is readable, well-commented, and gets the correct result.

Sample solution approach

# Key techniques used across the 6 questions:
#
# Q1: table() for counts, mean() for proportions
#     sort(table(major), decreasing=TRUE)
#
# Q2: summary(gpa) for distribution
#     tapply(gpa, major, mean) for group means
#
# Q3: cor(x, y) for Pearson r
#     lm(gpa ~ study_hours_week) for regression
#     summary(model)$r.squared for R-squared
#
# Q4: Subset into two vectors by first_gen==TRUE/FALSE
#     t.test(fg_gpa, nfg_gpa) and check p.value < 0.05
#
# Q5: lm(gpa ~ study_hours_week + sleep_hours + first_gen + year)
#     summary(model)$coefficients: Pr(>|t|) column for p-values
#
# Q6: Pre-compute all statistics, then write cat() sentences
#     Use sprintf() for inline number formatting
#     Let the data drive the text: if/else for significance statements

🏆 You’ve Completed All 10 R Practice Labs!

You have built a complete R toolkit: vectors, data frames, dplyr, functions, strings, ggplot2, statistics, tidyr, reporting, and a full end-to-end capstone analysis. These skills cover the core of professional data science in R.
