PCA Explained Variance Ratio¶
Formula¶
\[
\mathrm{EVR}_k = \frac{\lambda_k}{\sum_{j=1}^{p}\lambda_j}
\]
Plot¶
[Bar chart "Example explained variance ratios": components 1–5 with ratios 0.55, 0.22, 0.11, 0.07, 0.05.]
Parameters¶
- \(\lambda_k\): \(k\)-th eigenvalue of the data covariance matrix (conventionally sorted in descending order)
- \(p\): number of features (equivalently, the total number of principal components)
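The formula can be evaluated directly from the covariance eigenvalues. A minimal NumPy sketch, using a synthetic dataset (the array `X` and its scaling are illustrative, not from the text):

```python
import numpy as np

# Synthetic 3-feature dataset; the inflated first column is illustrative only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 0] *= 3.0  # give the first feature much larger variance

# Eigenvalues of the covariance matrix; eigvalsh returns them ascending,
# so reverse to get the conventional descending order.
cov = np.cov(X, rowvar=False)
eigenvalues = np.linalg.eigvalsh(cov)[::-1]

# EVR_k = lambda_k / sum_j lambda_j
evr = eigenvalues / eigenvalues.sum()
print(evr)        # per-component ratios, largest first
print(evr.sum())  # ratios sum to 1
```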
What it means¶
The explained variance ratio of the \(k\)-th component is the fraction of the dataset's total variance captured by that principal component.
What it's used for¶
- Choosing the number of components in PCA.
- Communicating compression tradeoffs.
Key properties¶
- Ratios sum to 1 across all components.
- Cumulative explained variance is often more useful than per-component values.
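To see why the cumulative view is useful for choosing components, here is a short sketch using the ratios from the bar chart above (the 0.85 threshold is an illustrative choice, not prescribed by the text):

```python
import numpy as np

# EVR values from the example bar chart (illustrative).
evr = np.array([0.55, 0.22, 0.11, 0.07, 0.05])

cumulative = np.cumsum(evr)
# Smallest number of components whose cumulative EVR reaches 85%:
# searchsorted finds the first index where the running sum >= 0.85.
k = int(np.searchsorted(cumulative, 0.85) + 1)
print(cumulative)  # running totals: 0.55, 0.77, 0.88, 0.95, 1.00
print(k)           # 3
```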
Common gotchas¶
- High variance does not necessarily correspond to task-relevant signal.
- Scaling features changes the covariance spectrum.
Example¶
If the first two components explain 85% of the variance, a 2D projection may preserve much of the spread.
How to Compute (Pseudocode)¶
Input: PCA eigenvalues lambda[1..p] (usually sorted descending)
Output: explained variance ratios EVR[1..p]
total <- sum_{j=1..p} lambda[j]
for k from 1 to p:
    EVR[k] <- lambda[k] / total
return EVR
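The pseudocode above translates directly into Python. A sketch assuming the eigenvalues are already available (the values passed in below are chosen to match the example bar chart):

```python
def explained_variance_ratio(lambdas):
    """Return EVR[k] = lambdas[k] / sum(lambdas) for each eigenvalue."""
    total = sum(lambdas)
    return [lam / total for lam in lambdas]

# Eigenvalues matching the example bar chart (illustrative values).
ratios = explained_variance_ratio([5.5, 2.2, 1.1, 0.7, 0.5])
print(ratios)  # approximately [0.55, 0.22, 0.11, 0.07, 0.05]
```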
Complexity¶
- Time: \(O(p)\) once PCA eigenvalues are available
- Space: \(O(p)\) for the ratio vector
- Assumptions: \(p\) is the number of components; the cost of fitting PCA/eigendecomposition is excluded