Your First Classifier — K-Nearest Neighbors
Find the closest examples, take a vote, make a prediction
📚 This module uses scikit-learn. The first time you run a code block, it will install scikit-learn via micropip (~10–20 seconds). Be patient — subsequent runs are instant!
📌 Before You Start
- Modules 1–3 completed (especially understanding train/test split)
- Patience for the first scikit-learn load (~10–20 seconds total)
Estimated time: ~55 minutes
What you’ll learn: How K-Nearest Neighbors works, how to use scikit-learn’s API (fit → predict → score), and how the choice of k affects results.
💡 The Big Idea
KNN asks: “What do the k most similar examples in my training set look like?” Then it takes a majority vote among those neighbors.
No complicated math. No training phase. When you call .predict() on a new flower, KNN just finds the 3 (or 5, or k) closest flowers in the training set and says “it’s probably the same species as most of them.”
It’s like asking your k nearest neighbors what they think — and going with the majority.
Simple, intuitive, and surprisingly effective for many problems. And it’s a perfect starting point for understanding the scikit-learn API that all ML algorithms share.
🧠 How It Works
KNN Step by Step
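In plain terms: measure the distance from the new point to every training example, pick the k closest, and let them vote. A minimal from-scratch sketch of those steps (plain NumPy, with a hypothetical helper name `knn_predict` — sklearn's real implementation is more optimized):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify one point by majority vote among its k nearest neighbors."""
    # Step 1: compute the Euclidean distance from x_new to every training point.
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Step 2: find the indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]
    # Step 3: take a majority vote among those neighbors' labels.
    votes = Counter(y_train[nearest])
    return votes.most_common(1)[0][0]

# Tiny 2-D example: two well-separated clusters of points.
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([1.1, 0.9]), k=3))  # 0
print(knn_predict(X_train, y_train, np.array([5.1, 5.0]), k=3))  # 1
```

Notice there is no training step at all: "fitting" a KNN model is just storing the data.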
Why k Matters
- k = 1: Follows training data perfectly. Very sensitive to noise. Often overfit.
- k = 3 or 5: Good balance for many datasets. Filters noise while staying responsive.
- k = 50+: Smooths over everything. May miss local patterns (underfit).
There’s no universal “best k.” You find it by experimenting — exactly what your exercise will do.
The sklearn API (same for every algorithm!)
This 5-step pattern works for every sklearn classifier: decision trees, random forests, SVMs. Learn it once, use it everywhere.
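A sketch of that shared pattern in action — the same `fit` → `predict` → `score` calls applied to two different classifiers. The decision tree here is just a stand-in for any other sklearn estimator; only the constructor line changes:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=42)

# Same steps, different algorithm -- only the "create" step differs.
for model in (KNeighborsClassifier(n_neighbors=5),
              DecisionTreeClassifier(random_state=42)):
    model.fit(X_train, y_train)               # fit on training data
    predictions = model.predict(X_test)       # predict on new data
    accuracy = model.score(X_test, y_test)    # score on held-out data
    print(f"{type(model).__name__}: {accuracy:.2f}")
```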
▶️ See It In Code
A complete KNN pipeline on the Iris dataset. This will install scikit-learn — first run takes 10–20 seconds.
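A sketch of such a pipeline, assuming k = 3 and a 30% test split (the sample flower measurements are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# 1. Load the Iris dataset (150 flowers, 4 features, 3 species).
iris = load_iris()
X, y = iris.data, iris.target

# 2. Hold out a test set; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# 3. Create and fit the classifier (fitting just stores the data).
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# 4. Predict a single new flower (sepal/petal measurements in cm).
new_flower = [[5.1, 3.5, 1.4, 0.2]]
print("Predicted species:", iris.target_names[knn.predict(new_flower)[0]])

# 5. Score on the held-out test set.
print(f"Test accuracy: {knn.score(X_test, y_test):.2f}")
```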
👋 Your Turn
Run the code below and find out which value of k gives the best accuracy on the Iris test set. Record the results and explain what you observe.
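One way to set up the experiment — a sketch that loops over candidate k values and prints the test accuracy for each:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Try a range of k values and record the test accuracy for each.
for k in [1, 3, 5, 7, 10, 15, 20]:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    print(f"k={k:2d}  accuracy={knn.score(X_test, y_test):.3f}")
```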
Try k_values like [1, 3, 5, 7, 10, 15, 20]. Notice how accuracy changes. Which tends to be better — very small k or very large k?
☕ Brain Break — 2 Minutes
You’re new to a city and trying to decide where to eat. You ask your 3 nearest neighbors:
- Neighbor 1: “Go to the pizza place on 5th!”
- Neighbor 2: “Pizza on 5th is great!”
- Neighbor 3: “Try the sushi place instead.”
Majority vote: 2 pizza vs 1 sushi → you go for pizza. That’s exactly KNN.
Now imagine asking 100 neighbors. Some live far away and don’t even know the pizza place exists. Their votes might not be helpful. This is why k matters — too many neighbors can drown out the signal.
The right number of “neighbors” to consult is almost always somewhere in the middle.
✅ Key Takeaways
- KNN classifies by finding the k most similar training examples and taking a majority vote. No training phase — it just memorizes.
- The standard sklearn API pattern is: import → create → fit → predict → score. This works for every algorithm.
- k too small (k=1): overfits, sensitive to noise. k too large: underfits, ignores local patterns. The sweet spot is usually between 3 and 15.
- Always set a random_state in train_test_split to get reproducible results.
- KNN requires feature scaling (see Module 3) because it’s purely distance-based: features with larger numeric ranges will dominate the distance calculation and bias the results.
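A quick demonstration of that last point, using sklearn’s built-in wine dataset, whose features span very different numeric ranges (Iris features are all small centimeter values, so scaling matters less there). This is a sketch, not part of the module’s required code:

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Wine features range from fractions of a unit up to the thousands,
# so unscaled distances are dominated by the largest-valued feature.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=42)

unscaled = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
scaled = make_pipeline(
    StandardScaler(),                      # scale features first...
    KNeighborsClassifier(n_neighbors=5),   # ...then measure distances
).fit(X_train, y_train)

print(f"Unscaled: {unscaled.score(X_test, y_test):.2f}")
print(f"Scaled:   {scaled.score(X_test, y_test):.2f}")
```

Wrapping the scaler and the classifier in a `Pipeline` also guarantees the scaler is fit only on training data, avoiding leakage from the test set.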
🎉 Module 4 Complete!
You’ve trained your first real ML model! Next, we’ll explore a completely different approach — one that makes decisions by asking yes/no questions.