Learn Without Walls
Phase 3 — Python for Data
Module 7 of 14

Python for Data — pandas in Google Colab

Write Python in your browser. Zero installation. Real data analysis in minutes.

~22 minutes
📌 Before You Start

What you need: Open colab.research.google.com in a new tab. Sign in with your Google account. No installation required. Your code runs on Google's servers, and you work with it entirely through your browser.

What you’ll do: Create a new notebook, import pandas, build a DataFrame (a table of data), filter rows, group and sum by category, and get instant statistics. Building, filtering, grouping, and summarizing are four operations that come up constantly in daily analyst work in Python.

💡 The Concept

Python is one of the most widely used programming languages for data analysis. pandas is the Python library that makes working with tables of data easy: think of it as a supercharged spreadsheet inside Python.

Why Python instead of just Sheets?

SQL → Python comparison (you already know SQL from Phase 1 — this will feel familiar):

SQL

Filter: WHERE sales > 400
Group: GROUP BY region
Sort: ORDER BY sales DESC

Python/pandas

Filter: df[df['Sales'] > 400]
Group: df.groupby('Region')
Sort: df.sort_values('Sales', ascending=False)

Different syntax. Same logic. Your SQL brain will help you learn pandas faster than you think.
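To make "same logic" concrete, here is a minimal sketch that chains all three operations into one pandas expression, with the matching SQL in the comments. The sample table and the SQL table name (sales_table) are assumptions made up for this illustration, and the 250 cutoff is arbitrary.

```python
import pandas as pd

# Illustrative sample data (an assumption for this sketch)
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol', 'David'],
    'Sales': [500, 300, 750, 200],
    'Region': ['North', 'South', 'North', 'East'],
})

# Roughly equivalent SQL (sales_table is a hypothetical table name):
#   SELECT Region, SUM(Sales) AS Sales
#   FROM sales_table
#   WHERE Sales > 250
#   GROUP BY Region
#   ORDER BY Sales DESC
result = (
    df[df['Sales'] > 250]              # WHERE: keep rows with Sales above 250
    .groupby('Region')['Sales'].sum()  # GROUP BY Region, then SUM(Sales)
    .sort_values(ascending=False)      # ORDER BY the summed Sales, largest first
)

print(result)
```

Note that groupby on its own only defines the groups. You get numbers back once you add an aggregation such as sum().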

🔗 Why It Matters

Python and pandas appear on a large share of data analyst and data scientist job postings. Even basic Python knowledge helps you stand out from applicants who only know spreadsheets.

Google Colab means you can practice Python from any computer with a browser — no setup, no installation, no IT department. You can show your work in a Colab notebook that anyone can view online. That is another portfolio piece.

🛠️ Tool Setup — Google Colab
1. Go to colab.research.google.com and sign in with your Google account.
2. Click “New Notebook” in the top left (or File → New Notebook). A blank notebook appears.
3. You will see a cell (a gray box). That is where you type Python code. Click the ▶ play button on the left of the cell to run it, or press Shift + Enter.
4. To add a new cell: click the “+ Code” button at the top, or hover below any cell and click the code button that appears. You can run each cell on its own, but all cells share one session, so anything you create in one cell is available in every cell you run afterward.
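A quick way to get comfortable with cells is to split a tiny program across two of them. The sketch below is only an illustration (the variable names are made up): type the first part into one cell and the second part into another, then run them in order. All cells in a notebook share one Python session, so the second cell can use what the first one created.

```python
# Cell A: create a variable
greeting = "Hello from Colab"

# Cell B (run after Cell A): the variable from Cell A is still available
message = greeting + "!"
print(message)
```

If you run Cell B before Cell A in a fresh session, Python raises a NameError because greeting does not exist yet. Run order matters more than the order of cells on the page.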
🖐️ Practice

Switch to your Colab notebook. Type each block of code into a new cell and run it with Shift+Enter. Do not copy-paste — typing it yourself builds the skill.

1. In your first cell, type and run:
Cell 1: Import pandas
import pandas as pd
No output = success. This loads the pandas library under the short name pd. You only need to do this once per session. If the Colab runtime restarts, run this cell again.
2. New cell. Create your first DataFrame:
Cell 2: Create a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Carol', 'David'],
    'Sales': [500, 300, 750, 200],
    'Region': ['North', 'South', 'North', 'East']
}
df = pd.DataFrame(data)
print(df)
You just created a DataFrame — pandas’ version of a table. See how it prints with row numbers, column headers, and values? That is your data, organized.
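Once a DataFrame exists, a few quick checks tell you what you are working with. The sketch below rebuilds the same table so it runs on its own. The checks shown are standard pandas attributes and methods, and which ones to use here is just a suggestion.

```python
import pandas as pd

# Same table as Cell 2, rebuilt so this cell is self-contained
data = {
    'Name': ['Alice', 'Bob', 'Carol', 'David'],
    'Sales': [500, 300, 750, 200],
    'Region': ['North', 'South', 'North', 'East']
}
df = pd.DataFrame(data)

print(df.shape)             # (number of rows, number of columns)
print(df.columns.tolist())  # column names as a plain Python list
print(df.dtypes)            # data type of each column
print(df.head(2))           # first two rows only
```

The row numbers on the left start at 0, not 1. That is the DataFrame's index, and pandas creates it automatically when you do not supply one.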
3. New cell. Filter rows (this is your SQL WHERE clause):
Cell 3: Filter rows
print(df[df['Sales'] > 400])
This shows only rows where Sales is greater than 400. Alice (500) and Carol (750) appear. Bob (300) and David (200) are filtered out. Compare this to WHERE Sales > 400 in SQL.
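In SQL you often combine conditions with AND or OR. A minimal sketch of the pandas equivalent follows, using the same table (rebuilt here so the cell runs on its own). The specific conditions are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol', 'David'],
    'Sales': [500, 300, 750, 200],
    'Region': ['North', 'South', 'North', 'East'],
})

# SQL: WHERE Sales > 400 AND Region = 'North'
# Use & for AND, and wrap each condition in parentheses
north_high = df[(df['Sales'] > 400) & (df['Region'] == 'North')]

# SQL: WHERE Sales > 400 OR Region = 'East'
# Use | for OR
high_or_east = df[(df['Sales'] > 400) | (df['Region'] == 'East')]

print(north_high)
print(high_or_east)
```

The parentheses are required. Without them, Python evaluates & and | before the comparisons and the line fails with an error.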
4. New cell. Group and sum (this is your SQL GROUP BY):
Cell 4: Group by Region
print(df.groupby('Region')['Sales'].sum())
Total sales by region. North = Alice + Carol = 500 + 750 = 1250. South = 300. East = 200. The regions print in alphabetical order (East, North, South). One line of code. This is what SUM(Sales) with GROUP BY Region does in SQL.
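SQL lets you ask for several aggregates at once, such as SUM, AVG, and COUNT. A hedged sketch of one way to do the same in pandas uses agg with a list of function names. The table is rebuilt here so the cell runs on its own, and the choice of statistics is just an example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol', 'David'],
    'Sales': [500, 300, 750, 200],
    'Region': ['North', 'South', 'North', 'East'],
})

# SQL: SELECT Region, SUM(Sales), AVG(Sales), COUNT(Sales) ... GROUP BY Region
summary = (
    df.groupby('Region')['Sales']
    .agg(['sum', 'mean', 'count'])  # one output column per statistic
    .reset_index()                  # turn Region back into a regular column
)

print(summary)
```

reset_index is optional. It just makes the result look like an ordinary table, which is handy if you plan to filter or sort it next.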
5. New cell. Get instant statistics:
Cell 5: Describe the data
print(df.describe())
One line produces count, mean, standard deviation, min, 25th percentile, median (shown as 50%), 75th percentile, and max for every numeric column, which here is just Sales. This is descriptive statistics on demand. In Sheets, this would take a separate formula for each statistic.
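If you only need one or two of those numbers, you can ask for them directly. The sketch below rebuilds the same table and computes a few statistics on the Sales column by hand, which is also a nice way to check that you understand what describe() reports.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol', 'David'],
    'Sales': [500, 300, 750, 200],
    'Region': ['North', 'South', 'North', 'East'],
})

sales = df['Sales']

average = sales.mean()    # (500 + 300 + 750 + 200) / 4
middle = sales.median()   # midpoint of the two middle values, 300 and 500
lowest = sales.min()
highest = sales.max()

print(average, middle, lowest, highest)

# describe() skips text columns by default; include='all' adds them
print(df.describe(include='all'))
```

With include='all', text columns such as Name and Region get their own rows (unique, top, freq), and cells that do not apply to a column show as NaN.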
6. New cell. Select a single column:
Cell 6: Select one column
print(df['Name'])
This prints just the Name column. Compare: in SQL this is SELECT Name FROM table. Same concept, Python syntax.
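Selecting more than one column works almost the same way, with one extra pair of square brackets. A minimal sketch, using the same table (rebuilt so the cell runs on its own):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol', 'David'],
    'Sales': [500, 300, 750, 200],
    'Region': ['North', 'South', 'North', 'East'],
})

# SQL: SELECT Name FROM table
names = df['Name']              # one column name gives a Series (a single column)

# SQL: SELECT Name, Sales FROM table
subset = df[['Name', 'Sales']]  # a list of column names gives a DataFrame

print(type(names).__name__)
print(type(subset).__name__)
print(subset)
```

The inner brackets are a Python list of column names. Forgetting them, as in df['Name', 'Sales'], raises a KeyError.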
🛑 Good stopping point. Module 8 uses Power BI — come back when ready.
🧠 Brain Break

You just wrote Python. Real Python. Hands that have never written a line of Python just wrote six working programs. Take your hands off the keyboard. Shake them out gently. Breathe.

Shake out your hands · Roll your wrists · Look away from the screen · 3 slow breaths

Take at least 2 full minutes. Your brain is processing new syntax patterns. It needs this.

✅ You Got This

The ONE thing to remember from this module:

pandas = spreadsheet superpowers in Python. Build a table. Filter it. Summarize it. Three skills you will use in most day-to-day analyst work.

What comes next: Module 8 introduces Power BI — Microsoft’s data visualization tool. If you know Tableau (Phase 2), Power BI will feel familiar. Knowing both makes you more versatile.
