Lab 11: Projection — The Best Shadow

Central idea

Projection replaces a vector y with the closest vector ŷ inside a chosen subspace S. The leftover error r = y - ŷ is called the residual. At the best approximation, the residual is perpendicular to every vector in S.

y = ŷ + r, ŷ ∈ S, r ⟂ S

best approximation, orthogonal residual, least squares, compression

1. Project a vector onto a line

Sliders: y₁, y₂ (components of y), u₁, u₂ (components of the direction u)

2. Error curve: why this coefficient is best

Every point on the line through the origin in direction u has the form cu for some scalar c. The best projection coefficient is the value of c that minimizes the squared error ||y - cu||².
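To see where the best coefficient comes from, expand ||y - cu||² = ||y||² - 2c(u·y) + c²(u·u). This is a quadratic in c with positive leading coefficient, so it is minimized at c = (u·y)/(u·u). The sketch below computes that coefficient and checks the two claims of this section numerically. The vectors y and u are made-up values, and the helper names are illustrative, not part of the lab.

```python
def dot(a, b):
    """Euclidean dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def project_onto_line(y, u):
    """Return (c, y_hat, r) for the projection of y onto the line spanned by u."""
    c = dot(u, y) / dot(u, u)                    # minimizer of ||y - c*u||^2
    y_hat = [c * ui for ui in u]                 # closest point on the line
    r = [yi - hi for yi, hi in zip(y, y_hat)]    # residual y - y_hat
    return c, y_hat, r

def squared_error(y, u, c):
    """Squared distance from y to the point c*u on the line."""
    return sum((yi - c * ui) ** 2 for yi, ui in zip(y, u))

# Illustrative values only (assumed, not taken from the lab).
y = [3.0, 4.0]
u = [1.0, 2.0]

c, y_hat, r = project_onto_line(y, u)
print("coefficient:", c)
print("residual . u:", dot(r, u))   # zero up to rounding

# Nudging c in either direction should only increase the error.
for trial in (c - 0.1, c, c + 0.1):
    print(trial, squared_error(y, u, trial))
```

The printed error values trace three points on the error curve: the middle one, at the projection coefficient, is the smallest.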

3. Least-squares line fitting

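Drag the noise and outlier sliders. The vector of fitted values is the projection of the response vector onto the column space of the design matrix, so the fitted line is the one whose predictions are closest to the data in squared error.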

Sliders: noise, outlier strength
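A minimal sketch of the same idea without the sliders, assuming a design matrix with two columns (a column of ones and a column of x values) and a small made-up data set. It solves the 2x2 normal equations directly, then checks that the residual is perpendicular to both columns, and finally shows how a single outlier pulls the fitted slope. The function name fit_line and the data are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least-squares fit of y ~ a + b*x by solving the 2x2 normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx            # zero only if all x values coincide
    b = (n * sxy - sx * sy) / det      # slope
    a = (sy - b * sx) / n              # intercept
    return a, b

# Made-up data, roughly y = 1 + 2x with small noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

a, b = fit_line(xs, ys)
fitted = [a + b * x for x in xs]                  # projection of ys
residual = [y - f for y, f in zip(ys, fitted)]    # what the line cannot explain

# Orthogonality to the two design-matrix columns (ones and xs).
sum_r = sum(residual)
sum_rx = sum(r * x for r, x in zip(residual, xs))
print("intercept, slope:", a, b)
print("residual . ones:", sum_r, " residual . xs:", sum_rx)

# One outlier: push the last response up by 10 and refit.
ys_outlier = ys[:-1] + [ys[-1] + 10.0]
a_out, b_out = fit_line(xs, ys_outlier)
print("slope with outlier:", b_out)
```

Because least squares penalizes squared error, the single shifted point changes the slope noticeably, which is the effect the outlier slider is meant to show.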

4. Projection matrix test

A projection matrix satisfies P² = P: projecting twice is the same as projecting once. An orthogonal projection, the kind used throughout this lab, is also symmetric (Pᵀ = P).

Slider: direction angle
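The test can be sketched in a few lines, assuming the matrix projects onto the line at a given angle in the plane, so that P = u uᵀ with u = (cos θ, sin θ). The 30 degree angle and the helper names are illustrative assumptions. For contrast, the sketch also tests half of P, which shrinks vectors further on every application and therefore fails the test.

```python
import math

def projection_matrix(theta):
    """2x2 orthogonal projection onto the line at angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s],
            [c * s, s * s]]            # P = u u^T with u = (cos theta, sin theta)

def matmul(A, B):
    """Product of two small matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def max_abs_diff(A, B):
    """Largest entrywise difference between two matrices of the same shape."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

P = projection_matrix(math.radians(30))    # illustrative angle
print("max |P^2 - P|:", max_abs_diff(matmul(P, P), P))   # zero up to rounding

# Counterexample: scaling P by 0.5 gives S with S^2 = 0.25 * P, not S.
S = [[0.5 * p for p in row] for row in P]
print("max |S^2 - S|:", max_abs_diff(matmul(S, S), S))
```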

5. Signal approximation

Approximate a signal using only a few sine waves. More basis functions mean a richer shadow space.

Slider: number of basis waves
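A minimal sketch of this experiment, under assumptions the lab text does not fix: the signal is a ramp sampled at interior grid points of (0, 1), and the basis waves are sin(kπt) for k = 1, 2, and so on. On this grid the sampled sines are mutually orthogonal, so each coefficient can be found by a separate line projection and the pieces added up.

```python
import math

N = 200
ts = [j / N for j in range(1, N)]      # interior sample points in (0, 1)
signal = [t for t in ts]               # hypothetical test signal: a ramp

def dot(a, b):
    """Euclidean dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def sine_wave(k):
    """Samples of sin(k*pi*t) on the grid ts."""
    return [math.sin(k * math.pi * t) for t in ts]

def sine_approximation(signal, num_waves):
    """Project the sampled signal onto span{sin(k*pi*t) : k = 1..num_waves}."""
    approx = [0.0] * len(signal)
    for k in range(1, num_waves + 1):
        wave = sine_wave(k)
        coeff = dot(signal, wave) / dot(wave, wave)   # line projection onto this wave
        approx = [a + coeff * w for a, w in zip(approx, wave)]
    return approx

def squared_error(a, b):
    """Squared distance between two sampled signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

wave_counts = (1, 3, 10, 30)
errors = [squared_error(signal, sine_approximation(signal, K)) for K in wave_counts]
for K, err in zip(wave_counts, errors):
    print(K, "waves -> squared error", err)

# The residual is perpendicular to every wave that was used.
approx10 = sine_approximation(signal, 10)
residual10 = [s - a for s, a in zip(signal, approx10)]
print("residual . first wave:", dot(residual10, sine_wave(1)))
```

Each larger wave count spans a bigger subspace containing the smaller ones, so the error can only go down as waves are added.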

6. High-dimensional projection

For a random vector in n dimensions, projection onto a random k-dimensional subspace keeps, on average, a fraction k/n of the squared length.

Sliders: ambient dimension n, subspace dimension k
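The k/n rule can be checked by simulation. The sketch below assumes Gaussian random vectors and builds a random subspace by running Gram-Schmidt on k Gaussian vectors. The values n = 50, k = 10, the trial count, and the seed are arbitrary choices for illustration.

```python
import random

def dot(a, b):
    """Euclidean dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def random_orthonormal_basis(n, k, rng):
    """k orthonormal vectors in R^n, from Gram-Schmidt on Gaussian vectors."""
    basis = []
    while len(basis) < k:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for q in basis:                    # strip components along earlier vectors
            c = dot(v, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > 1e-12:                   # skip the (unlikely) degenerate draw
            basis.append([vi / norm for vi in v])
    return basis

def kept_fraction(n, k, rng):
    """Fraction of squared length a random vector keeps after projection."""
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    basis = random_orthonormal_basis(n, k, rng)
    projected_sq = sum(dot(y, q) ** 2 for q in basis)   # ||Py||^2 for an orthonormal basis
    return projected_sq / dot(y, y)

rng = random.Random(0)                     # fixed seed so runs are repeatable
n, k, trials = 50, 10, 200
fractions = [kept_fraction(n, k, rng) for _ in range(trials)]
average = sum(fractions) / trials
print("average kept fraction:", average, " k/n:", k / n)
```

Individual trials scatter around k/n; it is the average over many trials that settles near it.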

Reflection prompts

Why is the residual perpendicular at the best approximation?
What is the difference between solving exactly and solving by least squares?
In a data model, what does the projection represent? What does the residual represent?
Why does projection prepare the way for PCA and compression?