Learn Without Walls

Lesson 7.4: Common String Patterns

By the end of this lesson, you will be able to:

Splitting Strings

split(): Breaks a string into a list of substrings based on a separator. By default, it splits on whitespace.
# Split on whitespace (default)
sentence = "Python is a great language"
words = sentence.split()
print(words)
print(f"Word count: {len(words)}")
['Python', 'is', 'a', 'great', 'language']
Word count: 5
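The default behavior is not the same as splitting on a single space. The short sketch below compares the two on a made-up string with irregular spacing (the sample text is an assumption for illustration):

```python
# Hypothetical sample text with irregular spacing.
text = "  Python   is  great "

# No argument: runs of whitespace count as one separator, and
# leading/trailing whitespace is dropped.
default_split = text.split()

# Explicit " ": every single space is a separator, so empty
# strings appear wherever two spaces sit side by side.
space_split = text.split(" ")

print(default_split)
print(space_split)
```

When the input may contain extra spaces, tabs, or newlines, calling split() with no argument is usually the safer choice.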

Splitting on a Specific Separator

# Split CSV data
csv_line = "Alice,25,Engineer,New York"
fields = csv_line.split(",")
print(fields)

# Split a URL path
url = "https://example.com/blog/2024/post"
parts = url.split("/")
print(parts)

# Split on any string
data = "red::green::blue::yellow"
colors = data.split("::")
print(colors)
['Alice', '25', 'Engineer', 'New York']
['https:', '', 'example.com', 'blog', '2024', 'post']
['red', 'green', 'blue', 'yellow']

Limiting Splits

# Split only on the first occurrence
assignment = "name = John Smith"
key, value = assignment.split(" = ", 1)
print(f"Key: {key}")
print(f"Value: {value}")
Key: name
Value: John Smith
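The limit matters most when the value itself may contain the separator. Here is a minimal sketch using a made-up line (the variable names and sample text are assumptions for illustration):

```python
# Hypothetical line whose value itself contains " = ".
line = "formula = a = b + c"

# Without a limit, every " = " splits, giving three pieces.
all_parts = line.split(" = ")

# With a limit of 1, only the first " = " splits; the rest stays in the value.
key, value = line.split(" = ", 1)

print(all_parts)   # ['formula', 'a', 'b + c']
print(key)         # formula
print(value)       # a = b + c
```

Without the limit, unpacking into key, value would fail here because split() would return three items instead of two.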

Joining Strings

join(): The opposite of split(). Combines a list (or any other iterable) of strings into a single string, placing the separator between each element. Every element must already be a string. Syntax: separator.join(iterable)
words = ["Python", "is", "awesome"]

# Join with spaces
sentence = " ".join(words)
print(sentence)

# Join with commas
csv = ",".join(words)
print(csv)

# Join with no separator
letters = ["P", "y", "t", "h", "o", "n"]
word = "".join(letters)
print(word)

# Join with newlines
lines = ["Line 1", "Line 2", "Line 3"]
text = "\n".join(lines)
print(text)
Python is awesome
Python,is,awesome
Python
Line 1
Line 2
Line 3

split() and join() Together

# Normalize spacing in a sentence
messy = "  too   many    spaces   here  "
clean = " ".join(messy.split())
print(f"'{clean}'")

# Convert between separators
csv_data = "Alice,Bob,Charlie"
tab_data = "\t".join(csv_data.split(","))
print(tab_data)
'too many spaces here'
Alice	Bob	Charlie

Parsing Structured Data

Combining split(), slicing, and other methods lets you extract information from structured text.

Parsing a Log Line

log = "2024-03-15 14:30:22 ERROR Database connection failed"

parts = log.split(" ", 3)   # Split into 4 parts max
date = parts[0]
time = parts[1]
level = parts[2]
message = parts[3]

print(f"Date: {date}")
print(f"Time: {time}")
print(f"Level: {level}")
print(f"Message: {message}")
Date: 2024-03-15
Time: 14:30:22
Level: ERROR
Message: Database connection failed
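The same steps can be wrapped in a small reusable function. This is only a sketch: the function name parse_log_line and the dictionary keys are assumptions for illustration, and it expects the "DATE TIME LEVEL MESSAGE" layout used above.

```python
def parse_log_line(line):
    """Split a 'DATE TIME LEVEL MESSAGE' log line into a dictionary."""
    parts = line.split(" ", 3)   # at most 3 splits, so at most 4 parts
    if len(parts) != 4:
        # Fail loudly instead of returning partial data.
        raise ValueError(f"Unexpected log format: {line!r}")
    date, clock, level, message = parts
    return {"date": date, "time": clock, "level": level, "message": message}


entry = parse_log_line("2024-03-15 14:30:22 ERROR Database connection failed")
print(entry["level"])
print(entry["message"])
```

Checking the number of parts before unpacking gives a clearer error message when a line does not match the expected layout.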

Parsing CSV Data

csv_text = "Alice,25,Engineer\nBob,30,Designer\nCharlie,28,Teacher"

for line in csv_text.split("\n"):
    name, age, job = line.split(",")
    print(f"{name} is {age} years old and works as a {job}")
Alice is 25 years old and works as a Engineer
Bob is 30 years old and works as a Designer
Charlie is 28 years old and works as a Teacher
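Every field returned by split() is a string, so numbers need converting before you can compare or calculate with them. The sketch below stores each row as a dictionary (the key names are assumptions for illustration). It only suits simple data like this; for fields that contain commas or quotes, Python's built-in csv module is the more robust tool.

```python
csv_text = "Alice,25,Engineer\nBob,30,Designer\nCharlie,28,Teacher"

records = []
for line in csv_text.split("\n"):
    name, age, job = line.split(",")
    # age arrives as the string "25"; convert it so it compares as a number.
    records.append({"name": name, "age": int(age), "job": job})

oldest = max(records, key=lambda record: record["age"])
print(f"{oldest['name']} is the oldest at {oldest['age']}")
```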

Building Strings

There are several ways to build strings from data in Python.

String Concatenation (+ operator)

first = "Hello"
last = "World"
result = first + ", " + last + "!"
print(result)
Hello, World!

f-strings (Recommended)

name = "Alice"
age = 25
print(f"{name} is {age} years old")
print(f"In 5 years, {name} will be {age + 5}")
Alice is 25 years old
In 5 years, Alice will be 30

Building with join() (Most Efficient for Many Items)

# More efficient than concatenation in a loop
parts = []
for i in range(5):
    parts.append(f"item{i}")
result = ", ".join(parts)
print(result)
item0, item1, item2, item3, item4
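One thing to watch for: join() only accepts strings. If the items are numbers, convert them first. A minimal sketch with made-up scores (the data is an assumption for illustration):

```python
scores = [88, 92, 79]   # hypothetical numeric data

# ", ".join(scores) would raise a TypeError because the items are ints,
# so convert each one to a string while joining.
joined = ", ".join(str(score) for score in scores)
print(joined)   # 88, 92, 79
```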

Try It Yourself

Given data = "John:Doe:35:Engineer:NYC", split it by ":" and create a formatted sentence with the information.

Multiline Strings

Use triple quotes (""" or ''') to create strings that span multiple lines.

poem = """Roses are red,
Violets are blue,
Python is great,
And so are you!"""
print(poem)
Roses are red,
Violets are blue,
Python is great,
And so are you!

splitlines()

The splitlines() method splits a multiline string into a list of lines.

text = """Line one
Line two
Line three"""

lines = text.splitlines()
print(lines)
print(f"Number of lines: {len(lines)}")
['Line one', 'Line two', 'Line three']
Number of lines: 3
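splitlines() and split("\n") often give the same result, but they differ on text that ends with a newline or uses Windows-style line endings. The sketch below shows both cases with made-up sample text:

```python
text = "Line one\nLine two\n"   # note the trailing newline

lines_a = text.splitlines()
lines_b = text.split("\n")
print(lines_a)   # ['Line one', 'Line two']
print(lines_b)   # ['Line one', 'Line two', '']

windows_text = "Line one\r\nLine two"   # Windows-style line ending

lines_c = windows_text.splitlines()
lines_d = windows_text.split("\n")
print(lines_c)   # ['Line one', 'Line two']
print(lines_d)   # ['Line one\r', 'Line two']
```

For text read from files, splitlines() is usually the more convenient choice because it handles both situations.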

Raw Strings

Raw String: A string prefixed with r that treats backslashes as literal characters instead of escape characters. Useful for file paths and regular expressions.
# Regular string: \n is a newline
print("Hello\nWorld")

# Raw string: \n is literally \n
print(r"Hello\nWorld")

# Useful for Windows file paths
path = r"C:\Users\name\documents\file.txt"
print(path)
Hello
World
Hello\nWorld
C:\Users\name\documents\file.txt
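Counting characters is a quick way to see the difference. In this small sketch (the variable names are assumptions for illustration), the regular string holds one newline character, while the raw string holds a backslash followed by the letter n:

```python
normal = "a\nb"    # 3 characters: a, newline, b
raw = r"a\nb"      # 4 characters: a, backslash, n, b

print(len(normal))   # 3
print(len(raw))      # 4

# A raw string and a regular string with doubled backslashes
# describe exactly the same text.
same = r"C:\temp" == "C:\\temp"
print(same)          # True
```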

Common Escape Characters

  • \n -- newline
  • \t -- tab
  • \\ -- literal backslash
  • \" -- literal double quote inside double-quoted string
  • \' -- literal single quote inside single-quoted string
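Each escape in the list above can be tried directly. A short sketch with made-up sample strings:

```python
tabbed = "Name:\tAlice"           # \t inserts a tab between label and value
path = "C:\\Users"                # \\ becomes one backslash
quoted = "She said \"hi\""        # \" keeps the double quotes inside the string
apostrophe = 'It\'s done'         # \' keeps the single quote inside the string

print(tabbed)
print(path)         # C:\Users
print(quoted)       # She said "hi"
print(apostrophe)   # It's done
```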

Check Your Understanding

  1. What does "a,b,c".split(",") return?
  2. What does "-".join(["2024", "03", "15"]) return?
  3. How do you split a string into individual lines?
  4. What does the r prefix do before a string?

Answers:

  1. ["a", "b", "c"]
  2. "2024-03-15"
  3. Use .splitlines() or .split("\n")
  4. It makes the string "raw" -- backslashes are treated as literal characters, not escape sequences

Key Takeaways