Lesson 7.3: Searching and Replacing
By the end of this lesson, you will be able to:
- Find substrings using find() and index()
- Search from the right using rfind() and rindex()
- Replace substrings with replace()
- Check for substring existence with the in operator
- Understand the basics of pattern matching
The in Operator
The simplest way to check if a substring exists within a string is the in operator.
    sentence = "Python is a powerful programming language"

    print("Python" in sentence)
    print("java" in sentence)
    print("powerful" in sentence)

    # Using 'not in'
    print("java" not in sentence)
The in operator is case-sensitive: "python" in "Python is great" returns False. Convert both strings to lowercase for case-insensitive checks.
    text = "Python is Great"
    search = "python"

    # Case-insensitive search
    if search.lower() in text.lower():
        print("Found it!")
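For text that goes beyond plain ASCII, the built-in str.casefold() method is a more aggressive alternative to lower(). The sketch below wraps the check in a small helper; the name contains_ignore_case is hypothetical and is used here only for illustration.

```python
def contains_ignore_case(text, search):
    """Return True if search occurs in text, ignoring case."""
    # casefold() works like lower() but also handles special cases,
    # such as the German "ß", which casefolds to "ss".
    return search.casefold() in text.casefold()


print(contains_ignore_case("Python is Great", "python"))  # True
print(contains_ignore_case("Python is Great", "java"))    # False
print(contains_ignore_case("Straße", "STRASSE"))          # True
```

For everyday English text, lower() and casefold() give the same result, so either is a reasonable choice.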
The find() Method
find() returns the index of the first occurrence of a substring. If not found, it returns -1 (instead of raising an error).
    text = "Hello, World! Hello, Python!"

    print(text.find("Hello"))  # First occurrence
    print(text.find("World"))  # Position of "World"
    print(text.find("Java"))   # Not found
Searching from a Starting Position
You can pass a second argument to find() to specify the index at which the search starts.
    text = "Hello, World! Hello, Python!"

    # Find second "Hello" by starting after the first
    first = text.find("Hello")
    second = text.find("Hello", first + 1)

    print(f"First 'Hello' at index: {first}")
    print(f"Second 'Hello' at index: {second}")
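find() also accepts an optional third argument that marks where the search stops. The following is a minimal sketch of both arguments on the same string; the variable names are chosen only for illustration. Note that the returned index is always relative to the whole string, not to the searched range.

```python
text = "Hello, World! Hello, Python!"

# Search only within text[0:13], which is "Hello, World!"
in_first_sentence = text.find("Python", 0, 13)
print(in_first_sentence)   # -1, because "Python" lies outside that range

# Search from index 14 onward, which is "Hello, Python!"
in_second_sentence = text.find("Python", 14)
print(in_second_sentence)  # 21
```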
Using find() Safely
    email = "user@example.com"
    at_pos = email.find("@")

    if at_pos != -1:
        username = email[:at_pos]
        domain = email[at_pos + 1:]
        print(f"Username: {username}")
        print(f"Domain: {domain}")
    else:
        print("Invalid email")
find() vs index()
find() returns -1 when not found. index() raises a ValueError when not found. Use find() when missing values are expected; use index() when missing values are errors.
    text = "Python programming"

    # find() returns -1
    print(text.find("Java"))

    # index() raises ValueError
    # text.index("Java")  # ValueError: substring not found

    # Both work the same when substring IS found
    print(text.find("prog"))
    print(text.index("prog"))
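When you do choose index(), one common way to handle the failure case is a try/except block. This is a minimal sketch; the message variable is introduced here only for illustration.

```python
text = "Python programming"

try:
    position = text.index("Java")
    message = f"Found at index {position}"
except ValueError:
    # index() raised ValueError because "Java" is not in text
    message = "Substring not found"

print(message)  # Substring not found
```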
Searching from the Right: rfind() and rindex()
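rfind() and rindex() search from the right (end) of the string and return the index of the last occurrence. As with find() and index(), rfind() returns -1 when the substring is not found, while rindex() raises a ValueError.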
    path = "/home/user/documents/report.pdf"

    # Find last "/" to get the filename
    last_slash = path.rfind("/")
    filename = path[last_slash + 1:]
    print(f"Filename: {filename}")

    # Find last "." to get the extension
    last_dot = path.rfind(".")
    extension = path[last_dot + 1:]
    print(f"Extension: {extension}")
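The example above assumes the path contains a dot. If it does not, rfind() returns -1 and the slice returns the whole path. The sketch below shows one way to guard against that; get_extension is a hypothetical helper name, not a built-in.

```python
def get_extension(path):
    """Return the text after the last '.' in the filename, or '' if none."""
    # If there is no "/", rfind() returns -1 and the slice starts at 0,
    # so filename is simply the whole path.
    filename = path[path.rfind("/") + 1:]
    last_dot = filename.rfind(".")
    if last_dot == -1:
        return ""  # no dot in the filename, so no extension
    return filename[last_dot + 1:]


print(get_extension("/home/user/documents/report.pdf"))  # pdf
print(get_extension("/home/user/documents/README"))      # prints an empty line
print(get_extension("/home/user.name/notes"))            # prints an empty line
```

Searching for the dot in the filename rather than the full path avoids mistaking a dot in a directory name for the start of an extension.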
Finding All Occurrences
    text = "the cat sat on the mat near the hat"
    search = "the"

    positions = []
    start = 0

    while True:
        pos = text.find(search, start)
        if pos == -1:
            break
        positions.append(pos)
        start = pos + 1

    print(f"'{search}' found at positions: {positions}")
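The same loop can be wrapped in a reusable function. The sketch below is one possible version: find_all and its overlapping parameter are hypothetical names introduced for illustration. Advancing by 1 (as the loop above does) finds overlapping matches, while advancing by the length of the search string skips them.

```python
def find_all(text, search, overlapping=True):
    """Return a list of every index at which search occurs in text."""
    if not search:
        # Guard: with an empty search string the non-overlapping
        # step would be 0 and the loop would never advance.
        return []
    positions = []
    start = 0
    step = 1 if overlapping else len(search)
    while True:
        pos = text.find(search, start)
        if pos == -1:
            break
        positions.append(pos)
        start = pos + step
    return positions


print(find_all("the cat sat on the mat near the hat", "the"))  # [0, 15, 28]
print(find_all("aaaa", "aa"))                                  # [0, 1, 2]
print(find_all("aaaa", "aa", overlapping=False))               # [0, 2]
```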
Advanced replace()
Building on what you learned in Lesson 1, here are more advanced uses of replace().
    # Censoring words
    message = "The password is secret123"
    censored = message.replace("secret123", "*****")
    print(censored)

    # Normalizing whitespace: replace double spaces with single
    # spaces until no double spaces remain
    messy = "too    many    spaces"
    while "  " in messy:
        messy = messy.replace("  ", " ")
    print(messy)

    # Converting separators
    csv_data = "name,age,city"
    tab_data = csv_data.replace(",", "\t")
    print(tab_data)
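replace() also takes an optional third argument that limits how many occurrences are replaced, counting from the left. A short sketch, with variable names chosen for illustration:

```python
text = "the cat sat on the mat near the hat"

# Replace only the first occurrence
first_only = text.replace("the", "a", 1)
print(first_only)  # a cat sat on the mat near the hat

# Replace the first two occurrences
first_two = text.replace("the", "a", 2)
print(first_two)   # a cat sat on a mat near the hat

# The original string is unchanged, because strings are immutable
print(text)        # the cat sat on the mat near the hat
```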
Try It Yourself
Given url = "https://www.example.com/path/to/page", use find() and slicing to extract just the domain name ("www.example.com").
Introduction to Pattern Matching
While find() and replace() work great for exact matches, sometimes you need to search for patterns rather than exact text. Python has a module called re (regular expressions) for this, which you will explore in later modules.
For now, here are some common pattern-checking techniques using the methods you already know:
    # Check if a string contains only digits
    pin = "1234"
    print(f"Is PIN all digits? {pin.isdigit()}")

    # Check if input looks like an email
    email = "user@example.com"
    has_at = "@" in email
    has_dot = "." in email
    print(f"Looks like email? {has_at and has_dot}")
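The email check above accepts strings such as "a.b@" because it only tests that both characters appear somewhere. A slightly stricter sketch, still using only count(), find(), and rfind(), might look like the following. The function name looks_like_email is hypothetical, and this remains a rough plausibility check rather than a real email validator.

```python
def looks_like_email(address):
    """Rough plausibility check; not a full email validator."""
    # Require exactly one "@"
    if address.count("@") != 1:
        return False
    at_pos = address.find("@")
    username = address[:at_pos]
    domain = address[at_pos + 1:]
    # Require a non-empty username and a dot inside the domain
    # that is neither its first nor its last character.
    dot_pos = domain.rfind(".")
    return bool(username) and 0 < dot_pos < len(domain) - 1


print(looks_like_email("user@example.com"))   # True
print(looks_like_email("first.last@com"))     # False (no dot after the @)
print(looks_like_email("user@@example.com"))  # False (two @ signs)
print(looks_like_email("@example.com"))       # False (empty username)
```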
Check Your Understanding
- What does "hello world".find("world") return?
- What does "hello world".find("python") return?
- What is the difference between find() and index()?
- How does rfind() differ from find()?
- 6 (the index where "world" begins)
- -1 (not found)
- find() returns -1 when not found; index() raises a ValueError
- rfind() searches from the right and returns the last occurrence
Key Takeaways
- Use in for quick existence checks: "word" in text
- find() returns the index of the first occurrence, or -1 if not found
- index() raises ValueError if not found -- use when absence is an error
- rfind() searches from the right -- useful for finding the last occurrence
- replace() returns a new string with substitutions applied
- All string searching is case-sensitive by default -- use .lower() for case-insensitive searches