Lesson 3: The Gram-Schmidt Process
Estimated time: 40-50 minutes
Learning Objectives
- Apply the Gram-Schmidt process to convert a basis into an orthogonal basis
- Normalize an orthogonal basis to produce an orthonormal basis
- Understand the connection to QR factorization
- Recognize why orthogonal bases are computationally superior
The Algorithm
Given a basis {x1, x2, ..., xk} for a subspace W, Gram-Schmidt produces an orthogonal basis {v1, v2, ..., vk} for the same subspace.
Gram-Schmidt Process:
v1 = x1
v2 = x2 - proj_{v1}(x2) = x2 - [(x2 · v1)/(v1 · v1)] v1
v3 = x3 - proj_{v1}(x3) - proj_{v2}(x3) = x3 - [(x3 · v1)/(v1 · v1)] v1 - [(x3 · v2)/(v2 · v2)] v2
Continue in the same way: vk = xk - proj_{v1}(xk) - proj_{v2}(xk) - ... - proj_{v(k-1)}(xk). Each new vector is the original minus its projections onto all previously built orthogonal vectors.
Key Idea
At each step, we subtract the components in the directions we have already built. What remains is the new orthogonal direction.
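The process above translates almost line-for-line into code. Here is a minimal NumPy sketch (the function names gram_schmidt and orthonormalize are ours, and the demo basis is an arbitrary one chosen for illustration):

```python
import numpy as np

def gram_schmidt(X):
    """Apply Gram-Schmidt to the rows of X (each row is one basis vector x_i).

    Returns an array whose rows v_i are pairwise orthogonal and span the
    same subspace as the rows of X.
    """
    V = []
    for x in np.asarray(X, dtype=float):
        v = x.copy()
        for u in V:                      # subtract the projection onto every
            v -= (x @ u) / (u @ u) * u   # previously built direction
        V.append(v)
    return np.array(V)

def orthonormalize(X):
    """Normalize the Gram-Schmidt output to an orthonormal basis."""
    V = gram_schmidt(X)
    return V / np.linalg.norm(V, axis=1, keepdims=True)

# Arbitrary R^2 basis for illustration:
V = gram_schmidt([[3, 1], [2, 2]])
print(V[0] @ V[1])  # 0.0 -- the rows are orthogonal
```

Note that the inner loop uses the original x in each dot product, exactly as in the formulas above; this is the "classical" form of Gram-Schmidt.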
Gram-Schmidt in R^2
Worked Example 1
Basis: x1 = (1, 1), x2 = (0, 1). Apply Gram-Schmidt.
Step 1: v1 = x1 = (1, 1).
Step 2: v2 = x2 - [(x2 · v1)/(v1 · v1)] v1 = (0,1) - [(0+1)/2](1,1) = (0,1) - (1/2, 1/2) = (-1/2, 1/2).
Check: v1 · v2 = (1)(-1/2) + (1)(1/2) = 0. Orthogonal!
Orthonormal basis: e1 = v1/||v1|| = (1/sqrt(2), 1/sqrt(2)), e2 = v2/||v2|| = (-1/sqrt(2), 1/sqrt(2)).
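Worked Example 1 can be checked numerically (a sketch using NumPy; variable names are ours):

```python
import numpy as np

# Worked Example 1: x1 = (1, 1), x2 = (0, 1)
x1, x2 = np.array([1.0, 1.0]), np.array([0.0, 1.0])

v1 = x1
v2 = x2 - (x2 @ v1) / (v1 @ v1) * v1

print(v2)        # [-0.5  0.5]
print(v1 @ v2)   # 0.0 -- orthogonal

# Normalize to get the orthonormal basis:
e1 = v1 / np.linalg.norm(v1)   # components +1/sqrt(2)
e2 = v2 / np.linalg.norm(v2)   # components -1/sqrt(2), +1/sqrt(2)
```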
Gram-Schmidt in R^3
Worked Example 2
Basis: x1 = (1, 1, 0), x2 = (1, 0, 1), x3 = (0, 1, 1).
Step 1: v1 = (1, 1, 0). v1 · v1 = 2.
Step 2: x2 · v1 = 1+0+0 = 1. v2 = (1,0,1) - (1/2)(1,1,0) = (1/2, -1/2, 1). v2 · v2 = 1/4 + 1/4 + 1 = 3/2.
Check: v1 · v2 = 1/2 - 1/2 + 0 = 0.
Step 3: x3 · v1 = 0+1+0 = 1. x3 · v2 = 0 - 1/2 + 1 = 1/2.
v3 = (0,1,1) - (1/2)(1,1,0) - (1/3)(1/2,-1/2,1) = (0,1,1) - (1/2,1/2,0) - (1/6,-1/6,1/3).
v3 = (0 - 1/2 - 1/6, 1 - 1/2 + 1/6, 1 - 0 - 1/3) = (-2/3, 2/3, 2/3).
Check: v1 · v3 = -2/3 + 2/3 + 0 = 0. v2 · v3 = -1/3 - 1/3 + 2/3 = 0.
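The three-vector computation in Worked Example 2 follows the same pattern, with one extra projection subtracted at step 3 (sketch; variable names are ours):

```python
import numpy as np

# Worked Example 2: x1 = (1,1,0), x2 = (1,0,1), x3 = (0,1,1)
x1 = np.array([1.0, 1.0, 0.0])
x2 = np.array([1.0, 0.0, 1.0])
x3 = np.array([0.0, 1.0, 1.0])

v1 = x1
v2 = x2 - (x2 @ v1) / (v1 @ v1) * v1
v3 = x3 - (x3 @ v1) / (v1 @ v1) * v1 - (x3 @ v2) / (v2 @ v2) * v2

print(v2)  # [ 0.5 -0.5  1. ]
print(v3)  # (-2/3, 2/3, 2/3)
print(v1 @ v2, v1 @ v3, v2 @ v3)  # all (numerically) zero
```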
Applying to a Subspace Basis
Worked Example 3
Find an orthogonal basis for W = span{(1, 0, 1, 0), (0, 1, 0, 1)}.
v1 = (1, 0, 1, 0). v1 · v1 = 2.
x2 · v1 = 0, so v2 = x2 - 0·v1 = (0, 1, 0, 1). The original vectors were already orthogonal!
This happens whenever an input vector is already orthogonal to the earlier directions: the projection coefficient is zero, and Gram-Schmidt leaves that vector unchanged.
QR Factorization
QR Factorization: If A is an m x n matrix with linearly independent columns, then A = QR where:
- Q is m x n with orthonormal columns (obtained by normalizing the Gram-Schmidt output)
- R is n x n upper triangular with positive diagonal entries
Worked Example 4
A = [1 0; 1 1; 0 1] (columns are x1 = (1,1,0), x2 = (0,1,1)).
Gram-Schmidt: v1 = (1,1,0), v2 = (0,1,1) - (1/2)(1,1,0) = (-1/2, 1/2, 1).
Normalize: q1 = (1/sqrt(2), 1/sqrt(2), 0), q2 = (-1/sqrt(6), 1/sqrt(6), 2/sqrt(6)).
Q = [q1 | q2], R = [||v1|| (x2·q1); 0 ||v2||] = [sqrt(2) 1/sqrt(2); 0 sqrt(3/2)].
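NumPy can compute this factorization directly. One caveat when comparing with the hand computation: np.linalg.qr follows a sign convention that may negate some columns of Q (and the corresponding rows of R), so we compare products and absolute values rather than raw entries (a sketch):

```python
import numpy as np

# Worked Example 4: columns x1 = (1,1,0), x2 = (0,1,1)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])

Q, R = np.linalg.qr(A)  # reduced QR: Q is 3x2, R is 2x2 upper triangular

print(np.allclose(Q @ R, A))            # True -- it really factors A
print(np.allclose(Q.T @ Q, np.eye(2)))  # True -- Q has orthonormal columns
print(np.abs(R))  # |R11| = sqrt(2), |R12| = 1/sqrt(2), |R22| = sqrt(3/2)
```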
Why QR Matters
QR factorization is the standard numerical method for solving least-squares problems. Solving Rx = Q^T b is more numerically stable than forming the normal equations A^T A x = A^T b, because the condition number of A^T A is the square of the condition number of A.
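To see this in action, here is a sketch of solving a least-squares problem both ways, reusing the matrix from Worked Example 4 (the right-hand side b is chosen arbitrarily for illustration):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])  # arbitrary right-hand side

# Via QR: A = QR, so the least-squares solution satisfies R x = Q^T b.
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ b)  # R is triangular, so this solve is cheap

# Via the normal equations: A^T A x = A^T b.  Same answer here, but
# forming A^T A squares the condition number, which can lose accuracy
# on ill-conditioned problems.
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

print(x_qr, x_ne)  # both give the least-squares solution (1/3, 7/3)
```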
Check Your Understanding
1. Apply Gram-Schmidt to x1 = (1, 0), x2 = (3, 4).
2. Why does Gram-Schmidt require the input vectors to be linearly independent?
3. In QR factorization, Q has orthonormal columns. What is Q^T * Q?
Key Takeaways
- Gram-Schmidt: subtract projections onto previous vectors to build orthogonal basis
- Each new vector = original - sum of projections onto all previous orthogonal vectors
- Normalize to get an orthonormal basis: divide each vector by its length
- QR factorization: A = QR where Q has orthonormal columns, R is upper triangular
- QR is the practical way to implement Gram-Schmidt and solve least squares