Learn Without Walls
Lab 7 of 10

Working with Data

Structuring and processing real-world data


📖 Concept Recap

Real data usually comes as a list of dictionaries — one dict per record:

[{"name": "Alice", "score": 92}, {"name": "Bob", "score": 78}, ...]

This structure is the foundation of data analysis, and a list like this can be loaded directly into a pandas DataFrame.
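As a minimal sketch of what working with this structure looks like, the snippet below filters and sorts a small list of records. The third record ("Cara") and the cutoff of 80 are assumptions added for illustration; they are not part of the lab's data.

```python
# Two records from the recap above, plus one hypothetical record ("Cara").
records = [
    {"name": "Alice", "score": 92},
    {"name": "Bob", "score": 78},
    {"name": "Cara", "score": 85},
]

# Filter: keep only the records whose score is at least 80 (an assumed cutoff).
passing = [r for r in records if r["score"] >= 80]

# Sort: order the remaining records from highest to lowest score.
ranked = sorted(passing, key=lambda r: r["score"], reverse=True)

for r in ranked:
    print(f"{r['name']}: {r['score']}")
```

Each record is accessed by key name, so the same two lines work for any field: change "score" to another key and the filter and sort follow.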

👀 Worked Example

Analyzing an employee dataset — grouping and summarizing:

from collections import defaultdict

employees = [
    {"name": "Alice Chen", "dept": "Engineering", "salary": 95000, "years": 5},
    {"name": "Bob Smith", "dept": "Marketing", "salary": 72000, "years": 3},
    {"name": "Carol Wu", "dept": "Engineering", "salary": 105000, "years": 8},
    {"name": "David Park", "dept": "HR", "salary": 68000, "years": 2},
    {"name": "Eve Johnson", "dept": "Marketing", "salary": 78000, "years": 6},
]

# Group by department
by_dept = defaultdict(list)
for emp in employees:
    by_dept[emp["dept"]].append(emp)

# Average salary by department
for dept, emps in by_dept.items():
    avg = sum(e["salary"] for e in emps) / len(emps)
    print(f"{dept}: ${avg:,.0f} avg salary")

# Top earner
top = max(employees, key=lambda e: e["salary"])
print(f"\nTop earner: {top['name']} (${top['salary']:,})")

✏️ Guided

Exercise 1 — Data Filter and Sorter

Complete the data operations by filling in the blanks. Each blank is a dict key name (as a string).

💡 Hint: The blanks are dict key names: "major", "gpa", "gpa". Use the exact spelling from the data.
💪 Independent

Exercise 2 — Multi-Field Analysis

Using the students list from Exercise 1, write code to find:

  1. The highest GPA in each major
  2. How many students are in each year (1, 2, 3, 4)
  3. The honor roll students (GPA ≥ 3.7) — print their names
💡 Hint: For highest GPA per major, group students by major first, then use max() on each group. For year counts, build a dict like year_counts = {} and use .get(year, 0) + 1.
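
The two patterns in the hint above can be sketched on unrelated data. The pets list and its field names are hypothetical and chosen only to illustrate the technique; adapting it to the students list is left as the exercise.

```python
from collections import defaultdict

# Hypothetical data: this is NOT the lab's students list.
pets = [
    {"name": "Rex", "species": "dog", "age": 4},
    {"name": "Milo", "species": "cat", "age": 7},
    {"name": "Bella", "species": "dog", "age": 9},
    {"name": "Luna", "species": "cat", "age": 4},
]

# Pattern 1: group the records by a field, then call max() on each group.
by_species = defaultdict(list)
for pet in pets:
    by_species[pet["species"]].append(pet)

oldest_age = {
    species: max(p["age"] for p in group)
    for species, group in by_species.items()
}

# Pattern 2: count occurrences with .get(key, 0) + 1.
# .get returns 0 the first time a key is seen, so no existence check is needed.
age_counts = {}
for pet in pets:
    age_counts[pet["age"]] = age_counts.get(pet["age"], 0) + 1

print(oldest_age)
print(age_counts)
```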
🔥 Challenge

Exercise 3 — Generic Pivot Function

Write a function pivot_by_field(records, field) that groups any list of dicts by any given field. It should return a dict of lists. Test it on the students list by grouping by both "major" and "year".

💡 Hint: Start with result = {}. Loop through records. Use result.setdefault(record[field], []).append(record) to build the groups without checking if the key exists yet.
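
To see how setdefault behaves on its own, here is a small sketch that groups plain strings by their first letter. The word list is an assumption made up for illustration; wrapping the same pattern in pivot_by_field is the challenge itself.

```python
# Hypothetical input, unrelated to the lab's data.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

by_letter = {}
for word in words:
    # setdefault returns the list already stored under this key,
    # or stores a new empty list first and returns that.
    by_letter.setdefault(word[0], []).append(word)

print(by_letter)
```

Because setdefault always hands back the list stored in the dict, the append call modifies the group in place and no "if key in dict" check is required.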
🏆 Mini Project

Mini Project — Sales Report Processor

Analyze the sales data below. Write a complete analysis that answers all 5 questions and prints a formatted report:

  1. Total revenue across all records
  2. Revenue by region
  3. Top 3 performing sales reps (total revenue each)
  4. Month with highest total sales
  5. Reps who exceeded quota (quota = $15,000 total)
💡 Hint: Build separate dicts for rep_revenue, region_revenue, and month_revenue as you loop. Sort rep_revenue by value and take the top 3 with sorted(..., key=lambda x: x[1], reverse=True)[:3].
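
The accumulate-then-rank approach from the hint can be sketched on a toy dataset. The team names and point values below are hypothetical and are not the lab's sales data; only the shape of the solution carries over.

```python
# Hypothetical (team, points) pairs; NOT the lab's sales records.
results = [
    ("red", 12),
    ("blue", 20),
    ("red", 8),
    ("green", 21),
    ("blue", 14),
]

# Accumulate a running total per team in a single pass.
totals = {}
for team, points in results:
    totals[team] = totals.get(team, 0) + points

# .items() yields (key, value) pairs, so x[1] is the total.
# Sort from largest to smallest total, then slice off the top 2.
ranked = sorted(totals.items(), key=lambda x: x[1], reverse=True)
top_two = ranked[:2]

for team, total in top_two:
    print(f"{team}: {total:,}")
```

The same loop can update several dicts at once, which is how one pass over the records can feed every question in the report.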

✅ Lab 7 Complete!

You can now structure, filter, sort, group, and aggregate real-world data — all without any external libraries. This is exactly what pandas automates. Now you’re ready for it.

Continue to Lab 8: pandas Basics →
