Learn Without Walls
Lab 8 of 10

pandas Basics

The data analyst’s most essential library


📖 Concept Recap

pandas is the go-to Python library for data analysis. This lab practices its core operations: creating DataFrames, filtering rows, and grouping data.

👀 Worked Example

Creating and exploring a DataFrame:

import pandas as pd
import numpy as np

data = {
    'Name': ['Alice', 'Bob', 'Carol', 'David', 'Eve'],
    'Department': ['Eng', 'Marketing', 'Eng', 'HR', 'Marketing'],
    'Salary': [95000, 72000, 105000, 68000, 78000],
    'Years': [5, 3, 8, 2, 6]
}
df = pd.DataFrame(data)

print(df)
print("\nShape:", df.shape)

print("\nBasic stats:")
print(df['Salary'].describe())

print("\nAvg salary by dept:")
print(df.groupby('Department')['Salary'].mean())
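The worked example shows creation, summary statistics, and grouping, but not row filtering. The sketch below reuses the same employee data to show one common way to filter with a boolean mask. The salary threshold and the variable names (high_paid_mask, high_paid, senior_eng) are illustrative assumptions, not part of the lab.

```python
import pandas as pd

# Same employee data as the worked example.
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol', 'David', 'Eve'],
    'Department': ['Eng', 'Marketing', 'Eng', 'HR', 'Marketing'],
    'Salary': [95000, 72000, 105000, 68000, 78000],
    'Years': [5, 3, 8, 2, 6],
})

# A comparison on a column yields a boolean Series: one True/False per row.
high_paid_mask = df['Salary'] > 75000  # threshold chosen for illustration

# Indexing with the mask keeps only the rows where it is True.
high_paid = df[high_paid_mask]
print(high_paid)

# Combine conditions with & (and) or | (or); wrap each condition in parentheses.
senior_eng = df[(df['Department'] == 'Eng') & (df['Years'] >= 5)]
print(senior_eng['Name'].tolist())
```

The parentheses matter because & binds more tightly than == and >= in Python, so leaving them out usually raises an error.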
✏️ Guided

Exercise 1 — Student DataFrame Explorer

Complete the pandas operations by filling in the column name strings in the blanks.

💡 Hint: The blanks are column name strings: 'Major', 'Major', 'GPA', 'GPA'. These column names are strings, so they must be quoted when used as keys, as in students['GPA'].
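To illustrate how quoted column names are used, here is a minimal sketch on a small stand-in DataFrame. The rows below are made up for illustration and are not the lab's actual students data; only the column names 'Major' and 'GPA' come from the hint.

```python
import pandas as pd

# Hypothetical stand-in for the lab's students DataFrame (values are invented).
students = pd.DataFrame({
    'Name': ['Ana', 'Ben', 'Chloe', 'Dev'],
    'Major': ['Math', 'Biology', 'Math', 'History'],
    'GPA': [3.8, 3.1, 3.5, 2.9],
})

# A single quoted name selects one column and returns a Series.
gpa = students['GPA']
print(type(gpa).__name__)

# A list of quoted names selects several columns and returns a DataFrame.
subset = students[['Name', 'GPA']]
print(type(subset).__name__)

# Column methods then work on the selected Series.
majors = students['Major'].unique()
mean_gpa = students['GPA'].mean()
print(majors)
print(round(mean_gpa, 3))
```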
💪 Independent

Exercise 2 — DataFrame Analysis

Using the same students DataFrame, write pandas code to:

  1. Find the student with the highest GPA (use .idxmax() or .sort_values())
  2. Count students per year (use .value_counts())
  3. Find the average GPA of scholarship vs non-scholarship students (use .groupby())
  4. Sort by GPA descending and show the top 3 students
💡 Hint: Highest GPA: students.loc[students['GPA'].idxmax()]. Year counts: students['Year'].value_counts(). Top 3: students.sort_values('GPA', ascending=False).head(3).
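The same four methods can be tried out on the employee data from the worked example before applying them to the students DataFrame. This is a sketch on different data, so it shows the patterns without giving the exercise answers; the variable names are illustrative.

```python
import pandas as pd

# Employee data from the worked example.
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol', 'David', 'Eve'],
    'Department': ['Eng', 'Marketing', 'Eng', 'HR', 'Marketing'],
    'Salary': [95000, 72000, 105000, 68000, 78000],
    'Years': [5, 3, 8, 2, 6],
})

# idxmax() returns the index label of the largest value; .loc fetches that row.
top_row = df.loc[df['Salary'].idxmax()]
print(top_row['Name'])

# value_counts() counts how often each distinct value appears.
dept_counts = df['Department'].value_counts()
print(dept_counts)

# groupby() splits rows by a key column, then aggregates within each group.
avg_years = df.groupby('Department')['Years'].mean()
print(avg_years)

# sort_values() with ascending=False puts the largest first; head(3) keeps three rows.
top3 = df.sort_values('Salary', ascending=False).head(3)
print(top3[['Name', 'Salary']])
```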
🔥 Challenge

Exercise 3 — GPA Categories with pd.cut()

Create a new column 'GPA_Category' using pd.cut() that labels each student’s GPA:

Then count how many students fall into each category.

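💡 Hint: bins is the list [0, 3.0, 3.3, 3.7, 4.01]. The top edge sits slightly above 4 so that a GPA of 4.0 still falls inside the last bin when the bins are left-closed (right=False); with pd.cut()'s default right=True, an edge of exactly 4.0 would already include it. labels is the list of 4 category strings in matching order.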
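How the bin edges behave is easiest to see on a small example. The sketch below bins made-up exam scores rather than GPAs, so the bins, labels, and data are all illustrative assumptions. It compares left-closed bins (right=False) with pd.cut()'s default right-closed bins.

```python
import pandas as pd

# Invented exam scores, chosen so some values sit exactly on a bin edge.
scores = pd.Series([55, 70, 89, 90, 100], name='Score')
labels = ['Low', 'Medium', 'High']  # one label per bin, in bin order

# right=False gives left-closed bins: [0, 70), [70, 90), [90, 100.01).
# The top edge is nudged above 100 so a perfect score still lands in a bin.
bands = pd.cut(scores, bins=[0, 70, 90, 100.01], labels=labels, right=False)
print(bands.tolist())

# Default right=True gives right-closed bins: (0, 70], (70, 90], (90, 100].
# Here 100 is already included, but 70 and 90 drop into the lower band.
default_bands = pd.cut(scores, bins=[0, 70, 90, 100], labels=labels)
print(default_bands.tolist())

# Count rows per category, kept in category order.
print(bands.value_counts().sort_index())
```

With n labels, bins needs n + 1 edges, which is why the hint pairs five edges with four category strings.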
🏆 Mini Project

Mini Project — Sales Dataset Explorer

Analyze the sales DataFrame below. Answer all 5 business questions using pandas and print a formatted report:

  1. Total revenue and total quota across all rows
  2. Top salesperson by total revenue
  3. Best month by total revenue
  4. Revenue by region (sorted highest to lowest)
  5. Percentage of rows where rep exceeded their quota (Revenue > Quota)
💡 Hint: Q1: sales['Revenue'].sum(). Q2: sales.groupby('Rep')['Revenue'].sum().idxmax(). Q5: sales['Hit_Quota'] = sales['Revenue'] > sales['Quota'] then .mean() * 100.
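The patterns in the hint can be rehearsed on a tiny stand-in table first. The four rows below are invented, and the 'Region' column name is an assumption based on question 4; only 'Rep', 'Revenue', and 'Quota' appear in the hint. The sketch uses a temporary Series instead of adding a 'Hit_Quota' column, which gives the same percentage.

```python
import pandas as pd

# Hypothetical miniature sales table (not the lab's dataset).
sales = pd.DataFrame({
    'Rep': ['Kim', 'Lee', 'Kim', 'Lee'],
    'Region': ['West', 'East', 'West', 'East'],  # column name assumed
    'Revenue': [1200, 900, 800, 1500],
    'Quota': [1000, 1000, 1000, 1000],
})

# Sum revenue per rep, then ask which rep has the largest total.
revenue_by_rep = sales.groupby('Rep')['Revenue'].sum()
top_rep = revenue_by_rep.idxmax()

# Sum revenue per region and sort from highest to lowest.
revenue_by_region = (
    sales.groupby('Region')['Revenue'].sum().sort_values(ascending=False)
)

# A boolean Series averages to the fraction of True values.
hit_rate = (sales['Revenue'] > sales['Quota']).mean() * 100

# f-strings handle the report formatting.
print(f"Top rep: {top_rep} (${int(revenue_by_rep.max()):,})")
print(f"Quota hit rate: {hit_rate:.1f}%")
print(revenue_by_region)
```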

✅ Lab 8 Complete!

You’ve created DataFrames, filtered rows, grouped data, and answered real business questions with pandas. This is the core skill of a data analyst.

Continue to Lab 9: Data Visualization →
