1. The PCA story
A dataset is a cloud of points. PCA asks which directions capture the most variation. The first principal component is the direction along which the centered cloud is most spread out, that is, the direction of largest variance. The second is the direction of largest remaining variance, perpendicular to the first.
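The story above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the helper name `pca` and the synthetic cloud (wide along one axis, rotated by 30 degrees, then shifted) are assumptions made up for the example.

```python
import numpy as np

def pca(X):
    """Return (mean, variances, components) for X of shape (n_samples, n_features).

    Rows of `components` are unit-length principal directions, ordered from
    largest to smallest variance.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                                # center the cloud
    cov = Xc.T @ Xc / (X.shape[0] - 1)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # reorder to descending
    return mean, eigvals[order], eigvecs[:, order].T

# Hypothetical data: an elongated cloud rotated by 30 degrees and shifted.
rng = np.random.default_rng(0)
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X = (rng.normal(size=(500, 2)) * [3.0, 0.5]) @ R.T + [5.0, -2.0]

mean, variances, components = pca(X)
print("first direction:", components[0])
print("variances:", variances)
```

The sign of each direction is arbitrary: a direction and its negative describe the same line.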
Key ideas: center, variance, covariance, eigenvectors, SVD, projection
2. Rotate a data cloud
Every line through the center of the cloud has a projected variance: the variance of the points after they are projected onto that line. PCA chooses the angle where this variance is largest.
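To make "projected variance" concrete, the sketch below computes it for a single angle. The function name `projected_variance` and the synthetic cloud are assumptions for illustration only.

```python
import numpy as np

def projected_variance(X, angle):
    """Variance of the 2D points in X after projection onto the line at `angle` (radians)."""
    u = np.array([np.cos(angle), np.sin(angle)])   # unit vector along the line
    Xc = X - X.mean(axis=0)                        # center first
    scores = Xc @ u                                # signed position of each point along the line
    return scores.var(ddof=1)

# Hypothetical cloud that is widest along the x axis.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2)) * [2.0, 0.5]

var_along_x = projected_variance(X, 0.0)
var_along_y = projected_variance(X, np.pi / 2)
print(var_along_x, var_along_y)
```

For this cloud the horizontal line gives a much larger projected variance than the vertical one, so it is the better candidate for the first principal direction.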
3. Variance as a function of direction
Plot the projected variance against the angle of the line. The peak of this curve is the first principal component direction.
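One way to check this claim numerically is to sweep the angle, evaluate the projected variance u^T C u for each unit vector u, and compare the peak with the top eigenvector of the covariance matrix C. The data, the 1-degree grid, and all variable names below are assumptions chosen for the sketch.

```python
import numpy as np

# Hypothetical cloud whose long axis sits at 40 degrees.
rng = np.random.default_rng(2)
theta_true = np.deg2rad(40)
R = np.array([[np.cos(theta_true), -np.sin(theta_true)],
              [np.sin(theta_true),  np.cos(theta_true)]])
X = (rng.normal(size=(1000, 2)) * [2.0, 0.6]) @ R.T

cov = np.cov(X, rowvar=False)                      # np.cov centers the data itself

# A line at angle a is the same line as at a + pi, so [0, pi] covers every line.
angles = np.linspace(0.0, np.pi, 181)              # 1-degree steps
units = np.column_stack([np.cos(angles), np.sin(angles)])
curve = np.einsum("ij,jk,ik->i", units, cov, units)  # u^T C u for each direction

peak_angle = angles[np.argmax(curve)]

eigvals, eigvecs = np.linalg.eigh(cov)             # ascending eigenvalues
pc1 = eigvecs[:, -1]                               # eigenvector of the largest eigenvalue
pc1_angle = np.arctan2(pc1[1], pc1[0]) % np.pi     # fold the sign ambiguity into [0, pi)

print(np.rad2deg(peak_angle), np.rad2deg(pc1_angle))
```

The height of the peak approximates the largest eigenvalue, which is the variance along the first principal component.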
4. Keep only the first component
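Keeping one component means projecting each point onto the first principal direction, so each point is described by a single coordinate along that line instead of its original coordinates. This is a compressed, lossy version of the data.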
Question. When noise increases, does one component still explain most of the variation?
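A small experiment can help explore the question. The sketch below projects noisy points onto the first component, rebuilds them from that single coordinate, and reports the fraction of variance the component explains. The helper `first_component_summary`, the underlying line, and the noise levels are all hypothetical choices.

```python
import numpy as np

def first_component_summary(X):
    """Project X onto its first principal direction.

    Returns (scores, X_hat, explained): the 1D coordinates, the rank-one
    reconstruction in the original space, and the explained-variance fraction.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions of the centered data.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    pc1 = Vt[0]
    scores = Xc @ pc1                        # one number per point
    X_hat = mean + np.outer(scores, pc1)     # points moved onto the best line
    explained = s[0] ** 2 / np.sum(s ** 2)
    return scores, X_hat, explained

rng = np.random.default_rng(3)
t = rng.normal(size=300)
line = np.column_stack([t, 0.5 * t])         # hypothetical points lying exactly on a line

results = {}
for noise in (0.05, 0.5, 2.0):
    X = line + noise * rng.normal(size=line.shape)
    scores, X_hat, explained = first_component_summary(X)
    results[noise] = explained
    print(f"noise={noise}: first component explains {explained:.2f} of the variance")
```

Try other noise levels and watch how the explained fraction changes as the cloud becomes rounder.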
5. Scaling changes PCA
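PCA follows variance. If a feature is recorded in units that make its numbers larger (for example, millimetres instead of metres), its variance grows and it may dominate the first component.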
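The effect is easy to reproduce. In the sketch below, two correlated features start with similar variance; multiplying the first by 1000 (a stand-in for a change of units) pulls the first principal direction onto that feature, and standardizing each feature to unit variance removes the effect. The data and the helper `first_direction` are assumptions for illustration.

```python
import numpy as np

def first_direction(X):
    """Unit vector of the first principal direction of X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]

# Hypothetical data: two correlated features with similar variance.
rng = np.random.default_rng(4)
a = rng.normal(size=500)
b = 0.8 * a + 0.6 * rng.normal(size=500)
X = np.column_stack([a, b])

X_scaled = X * [1000.0, 1.0]                       # enlarge the first feature only
X_std = (X_scaled - X_scaled.mean(axis=0)) / X_scaled.std(axis=0, ddof=1)

d_raw = first_direction(X)            # mixes both features
d_scaled = first_direction(X_scaled)  # almost entirely the enlarged feature
d_std = first_direction(X_std)        # equal weight on both standardized features

print("raw:         ", d_raw)
print("scaled:      ", d_scaled)
print("standardized:", d_std)
```

Standardizing is equivalent to running PCA on the correlation matrix instead of the covariance matrix.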
6. Scree plot and explained variance
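A scree plot shows the variance of each component in decreasing order. The sketch below computes the numbers behind such a plot, the explained-variance ratio per component and its running total, on made-up five-feature data. The data and variable names are assumptions.

```python
import numpy as np

# Hypothetical data: five latent directions with decreasing spread,
# mixed by a random orthogonal matrix so no single feature stands out.
rng = np.random.default_rng(5)
Z = rng.normal(size=(200, 5)) * [4.0, 2.0, 1.0, 0.5, 0.25]
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))
X = Z @ Q

Xc = X - X.mean(axis=0)
s = np.linalg.svd(Xc, compute_uv=False)     # singular values, largest first
variances = s ** 2 / (X.shape[0] - 1)       # variance along each component
ratio = variances / variances.sum()         # explained-variance ratio
cumulative = np.cumsum(ratio)               # running total

for k, (r, c) in enumerate(zip(ratio, cumulative), start=1):
    print(f"component {k}: ratio={r:.3f}, cumulative={c:.3f}")
```

Plotting `variances` against the component index gives the scree plot; a common heuristic is to keep the components before the curve flattens out.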
Reflection prompts
- Why does PCA require centering?
- What is the difference between a principal direction and a principal score?
- Why is PCA a projection method?
- When should you standardize features before PCA?
- Why can a high-variance direction fail to be a good prediction direction?