Lesson 2: Introduction to the Singular Value Decomposition (SVD)

Estimated time: 45-55 minutes

Learning Objectives

State the SVD theorem: A = U Sigma V^T
Compute the SVD for small matrices step by step
Understand the geometric interpretation: rotate, stretch, rotate
Connect singular values to eigenvalues of A^T A

What Is the SVD?

Singular Value Decomposition: Every m x n matrix A can be factored as A = U * Sigma * V^T, where:

U is m x m orthogonal (columns are left singular vectors)
Sigma is m x n diagonal (entries are singular values sigma_1 ≥ sigma_2 ≥ ... ≥ 0)
V is n x n orthogonal (columns are right singular vectors)

Unlike diagonalization, the SVD exists for every matrix -- not just square ones, and not just diagonalizable ones.

Singular Values and Vectors

Singular Values: The singular values of A are sigma_i = sqrt(lambda_i), where lambda_i are the eigenvalues of A^T A (arranged in decreasing order).

Right singular vectors: Columns of V are the eigenvectors of A^T A (orthonormal).

Left singular vectors: Columns of U are the eigenvectors of A*A^T (orthonormal), or computed from u_i = (1/sigma_i) A v_i.

Computing the SVD: Step by Step

Step 1: Compute A^T A and find its eigenvalues and eigenvectors. The eigenvectors (orthonormalized) form V.

Step 2: The singular values are the square roots of the eigenvalues of A^T A.

Step 3: Compute U columns: u_i = (1/sigma_i) A v_i for each nonzero singular value. Extend to a full orthonormal set if needed.

Worked Example 1: SVD of a 2x2 Matrix

A = [3 0; 0 2].

Step 1: A^T A = [9 0; 0 4]. Eigenvalues: 9 and 4. V = I = [1 0; 0 1].

Step 2: sigma_1 = 3, sigma_2 = 2. Sigma = [3 0; 0 2].

Step 3: u1 = (1/3)A(1,0)^T = (1,0). u2 = (1/2)A(0,1)^T = (0,1). U = I.

A = I [3 0; 0 2] I^T. For diagonal matrices, the SVD is trivial!

Worked Example 2: SVD of a Rectangular Matrix

A = [1 1; 0 1; 1 0]. (3x2 matrix)

Step 1: A^T A = [1 0 1; 1 1 0][1 1; 0 1; 1 0] = [2 1; 1 2]. Eigenvalues: 3 and 1.

lambda = 3: v1 = (1/sqrt(2))(1, 1). lambda = 1: v2 = (1/sqrt(2))(1, -1).

Step 2: sigma_1 = sqrt(3), sigma_2 = 1.

Step 3: u1 = (1/sqrt(3)) A v1 = (1/sqrt(3))(1/sqrt(2))(2, 1, 1) = (1/sqrt(6))(2, 1, 1).

u2 = A v2 = (1/sqrt(2))(0, 1, -1) = wait, let me recompute. u2 = (1/1)(1/sqrt(2))(0, -1, 1) = (1/sqrt(2))(0, -1, 1).

u3 is chosen to complete the orthonormal basis for R^3: u3 = (1/sqrt(3))(1, -1, -1) (after orthogonalization).

Geometric Interpretation

The SVD reveals that every linear transformation can be decomposed into three simpler steps:

A = U Sigma V^T: first rotate/reflect (V^T), then stretch (Sigma), then rotate/reflect (U)

Geometry: A maps the unit sphere to an ellipse. The singular values are the semi-axis lengths of the ellipse. The right singular vectors are the input directions, and the left singular vectors are the output directions.

Properties and Connections

Key Properties:

rank(A) = number of nonzero singular values
||A|| (operator norm) = sigma_1 (the largest singular value)
||A||_F (Frobenius norm) = sqrt(sigma_1^2 + sigma_2^2 + ...)
The singular values of A are the square roots of the eigenvalues of A^T A

Low-Rank Approximation

Truncating the SVD to the first k singular values gives the best rank-k approximation of A.

Eckart-Young Theorem: A_k = sigma_1 u1 v1^T + sigma_2 u2 v2^T + ... + sigma_k uk vk^T is the closest rank-k matrix to A (in both Frobenius and operator norms).

Worked Example 3: Rank-1 Approximation

A = [3 0; 0 2]. sigma_1 = 3, u1 = (1,0), v1 = (1,0).

Best rank-1 approximation: A_1 = 3 * (1,0)^T * (1,0) = [3 0; 0 0].

Error: ||A - A_1||_F = 2 (the second singular value). The approximation captures 3/sqrt(13) is about 83% of the information.

Check Your Understanding

1. What are the three factors in the SVD and what are their properties?

Answer: A = U Sigma V^T. U is orthogonal (m x m), Sigma is diagonal (m x n) with nonnegative entries, V is orthogonal (n x n).

2. How do you find the singular values of A?

Answer: Compute eigenvalues of A^T A. The singular values are the square roots of these eigenvalues.

3. A has singular values 5, 3, 0. What is the rank of A?

Answer: rank = 2 (the number of nonzero singular values).

Key Takeaways

SVD: A = U Sigma V^T exists for every matrix
Singular values = sqrt(eigenvalues of A^T A)
Geometric meaning: rotate, stretch, rotate
rank(A) = number of nonzero singular values
Truncated SVD gives the best low-rank approximation
Foundation for data compression, noise reduction, and recommender systems

Next Lesson

PCA and Dimensionality Reduction.

Start Lesson 3

Module Home

Module 8 Home