Working with Data
Structuring and processing real-world data
📖 Concept Recap
Real data usually comes as a list of dictionaries — one dict per record:
[{"name": "Alice", "score": 92}, {"name": "Bob", "score": 78}, ...]
- Filtering: [r for r in records if r["field"] == value]
- Sorting: sorted(records, key=lambda r: r["field"], reverse=True)
- Aggregating: sum(r["salary"] for r in records)
- Grouping: loop and build a dict of lists
- Finding max/min record: max(records, key=lambda r: r["value"])
This is the foundation of data analysis, and these are the same operations (filter, sort, group, aggregate) that pandas provides as built-in methods.
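The recap operations can be tried together on a small dataset. This is a minimal sketch; the records and the field names "dept" and "salary" are made up for illustration and are not the lab's data.

```python
# Hypothetical records for illustration only.
records = [
    {"name": "Alice", "dept": "Eng", "salary": 120},
    {"name": "Bob", "dept": "Ops", "salary": 90},
    {"name": "Cara", "dept": "Eng", "salary": 100},
]

# Filtering: keep only records whose field matches a value.
eng = [r for r in records if r["dept"] == "Eng"]

# Sorting: highest salary first.
by_salary = sorted(records, key=lambda r: r["salary"], reverse=True)

# Aggregating: sum one field across all records.
total = sum(r["salary"] for r in records)

# Grouping: loop and build a dict of lists keyed by department.
groups = {}
for r in records:
    groups.setdefault(r["dept"], []).append(r)

# Finding the max/min record (the whole dict, not just the value).
top = max(records, key=lambda r: r["salary"])
lowest = min(records, key=lambda r: r["salary"])

print([r["name"] for r in eng])
print([r["name"] for r in by_salary])
print(total, top["name"], lowest["name"])
```

Note that max() and min() with a key return the full record, so any other field of the winner (such as the name) is available afterwards.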
👀 Worked Example
Analyzing an employee dataset — grouping and summarizing:
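The original listing for this example is not reproduced here, so the following is only a sketch of what grouping and summarizing an employee dataset might look like. The employees list, the field names, and the helper summarize_by_department are all assumptions made for illustration.

```python
# Hypothetical employee data; names, departments, and salaries are invented.
employees = [
    {"name": "Alice", "department": "Engineering", "salary": 95000},
    {"name": "Bob", "department": "Marketing", "salary": 62000},
    {"name": "Cara", "department": "Engineering", "salary": 105000},
    {"name": "Dan", "department": "Marketing", "salary": 58000},
    {"name": "Eve", "department": "Sales", "salary": 70000},
]


def summarize_by_department(rows):
    """Group rows by department, then compute simple per-group statistics."""
    # Step 1: group into a dict of lists.
    by_dept = {}
    for row in rows:
        by_dept.setdefault(row["department"], []).append(row)

    # Step 2: summarize each group.
    summary = {}
    for dept, members in by_dept.items():
        salaries = [m["salary"] for m in members]
        summary[dept] = {
            "count": len(members),
            "average": sum(salaries) / len(salaries),
            "top_earner": max(members, key=lambda m: m["salary"])["name"],
        }
    return summary


summary = summarize_by_department(employees)
for dept, stats in sorted(summary.items()):
    print(
        f"{dept}: {stats['count']} employees, "
        f"avg ${stats['average']:,.0f}, top earner {stats['top_earner']}"
    )
```

Separating the grouping step from the summarizing step keeps each loop simple, and the grouped dict can be reused for other statistics.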
Exercise 1 — Data Filter and Sorter
Complete the data operations by filling in the blanks. Each blank is a dict key name (as a string).
Hint: the blanks are "major", "gpa", "gpa". Use the exact spelling from the data.
Exercise 2 — Multi-Field Analysis
Using the students list from Exercise 1, write code to find:
- The highest GPA in each major
- How many students are in each year (1, 2, 3, 4)
- The honor roll students (GPA ≥ 3.7) — print their names
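The counting and per-group maximum tasks in the list above rely on two small dictionary patterns. Here is a minimal sketch of both on unrelated, made-up data, so the exercise itself stays open; the names colors and scores_by_team are illustrative only.

```python
# Counting occurrences with dict.get: the default 0 covers first sightings.
colors = ["red", "blue", "red", "green", "blue", "red"]
color_counts = {}
for color in colors:
    color_counts[color] = color_counts.get(color, 0) + 1

# Per-group maximum: once values are grouped, apply max() to each group.
scores_by_team = {"red": [3, 9, 4], "blue": [7, 2]}
best_by_team = {team: max(scores) for team, scores in scores_by_team.items()}

print(color_counts)
print(best_by_team)
```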
Hint: for the highest GPA per major, group the students by major, then call max() on each group. For year counts, build a dict like year_counts = {} and use .get(year, 0) + 1.
Exercise 3 — Generic Pivot Function
Write a function pivot_by_field(records, field) that groups any list of dicts by any given field. It should return a dict of lists. Test it on the students list by grouping by both "major" and "year".
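To make the expected "dict of lists" shape concrete, here is a minimal sketch of the pattern on made-up (category, item) pairs rather than the students list. The data and variable names are assumptions; pivot_by_field would apply the same idea with the key taken from each record.

```python
# Hypothetical input: (category, item) pairs instead of records.
pairs = [("fruit", "apple"), ("veg", "kale"), ("fruit", "pear")]

grouped = {}
for category, item in pairs:
    # setdefault returns the existing list for this key, or stores and
    # returns a new empty list the first time the key is seen.
    grouped.setdefault(category, []).append(item)

print(grouped)
```

Each key maps to a list of everything that shared that key, in the order it was encountered.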
Hint: start with result = {}. Loop through records. Use result.setdefault(record[field], []).append(record) to build the groups without checking whether the key exists yet.
Mini Project — Sales Report Processor
Analyze the sales data below. Write a complete analysis that answers all 5 questions and prints a formatted report:
- Total revenue across all records
- Revenue by region
- Top 3 performing sales reps (total revenue each)
- Month with highest total sales
- Reps who exceeded quota (quota = $15,000 total)
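Once per-rep totals exist, the ranking and quota questions in the list above reduce to a sort and a filter. The sketch below assumes the totals have already been accumulated into a dict; the rep names and amounts are invented, and only the $15,000 quota comes from the task.

```python
QUOTA = 15_000  # quota stated in the task

# Hypothetical per-rep totals; in the project these would be computed
# by summing each rep's sales records.
totals = {"Ana": 18200, "Ben": 12950, "Chloe": 21400, "Dev": 15000, "Eli": 9800}

# Top 3: sort (rep, total) pairs by the total, largest first, then slice.
top3 = sorted(totals.items(), key=lambda x: x[1], reverse=True)[:3]

# Exceeded quota: strictly greater than the quota.
over_quota = sorted(name for name, amount in totals.items() if amount > QUOTA)

for rank, (name, amount) in enumerate(top3, start=1):
    print(f"{rank}. {name:<8} ${amount:>10,}")
print("Over quota:", ", ".join(over_quota))
```

In this sketch a rep who lands exactly on the quota is not counted as having exceeded it; adjust the comparison if the report should treat that case differently.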
Hint: for the top 3 reps, sort the (rep, total) pairs with sorted(..., key=lambda x: x[1], reverse=True)[:3].
✅ Lab 7 Complete!
You can now structure, filter, sort, group, and aggregate real-world data — all without any external libraries. This is exactly what pandas automates. Now you’re ready for it.
Continue to Lab 8: pandas Basics →