Scree Plot¶

Formula¶

\[ \text{scree plot: } k \mapsto \lambda_k \quad \text{or} \quad k \mapsto \sum_{j=1}^{k}\mathrm{EVR}_j \]

Plot¶

type: bars
xs: 1 | 2 | 3 | 4 | 5 | 6
ys: 0.42 | 0.24 | 0.14 | 0.09 | 0.06 | 0.05
xmin: 0.5
xmax: 6.5
ymin: 0
ymax: 0.48
height: 280
title: Example scree plot (explained variance by component)

Parameters¶

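\(k\): component index, \(k = 1, \dots, K\)
\(\lambda_k\): \(k\)-th largest eigenvalue of the covariance (or correlation) matrix, so \(\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_K\)
\(\mathrm{EVR}_j\): explained-variance ratio of component \(j\), \(\lambda_j / \sum_{i=1}^{K} \lambda_i\)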

What it means¶

A scree plot shows how much variance each component explains, in decreasing order, to help choose a truncation point in PCA and factor-analysis methods.

What it's used for¶

Selecting a practical number of principal components.
Explaining diminishing returns from additional components.

Key properties¶

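Look for an "elbow": the component index after which each additional component adds little explained variance and the curve flattens.
The y-axis can show raw eigenvalues, per-component explained-variance ratios, or cumulative explained variance.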
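
As a worked illustration, take the bar heights from the example plot on this card as explained-variance ratios (0.42, 0.24, 0.14, 0.09, 0.06, 0.05). The running sums give the cumulative curve:

\[ \sum_{j=1}^{k}\mathrm{EVR}_j:\quad 0.42,\ 0.66,\ 0.80,\ 0.89,\ 0.95,\ 1.00 \qquad (k = 1, \dots, 6) \]

The per-component drops shrink from 0.18 (between components 1 and 2) to 0.01 (between components 5 and 6), which is the flattening that an elbow reading looks for.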

Common gotchas¶

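Elbows can be subjective: different readers may pick different cutoffs, and some curves have no clear elbow.
Validate the chosen dimensionality against downstream task performance rather than relying on the plot alone.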

Example¶

If cumulative explained variance reaches 95% by component 20 and then flattens, 20 components is a reasonable candidate.
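
A threshold rule like the one in this example can be sketched in Python with NumPy. The function name `components_for_threshold`, the 0.95 default, and the small round-off tolerance are assumptions made for illustration; the input is the six-component example from this card, not real data.

```python
import numpy as np

def components_for_threshold(evr, threshold=0.95):
    """Smallest k whose cumulative explained-variance ratio reaches threshold.

    Hypothetical helper: `evr` is assumed to be sorted in decreasing order
    and to sum to at most 1.
    """
    cumulative = np.cumsum(np.asarray(evr, dtype=float))
    # Small tolerance so floating-point round-off does not push the
    # crossing point one component later than intended.
    reached = cumulative >= threshold - 1e-12
    if not reached.any():
        raise ValueError("threshold is never reached by the given components")
    # argmax returns the first True position (0-based); components are 1-based.
    return int(np.argmax(reached)) + 1

# Bar heights from the example plot on this card.
evr = [0.42, 0.24, 0.14, 0.09, 0.06, 0.05]
k95 = components_for_threshold(evr, 0.95)
k80 = components_for_threshold(evr, 0.80)
```

A fixed threshold is only one possible convention; the resulting count is still a candidate to check against task performance.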

How to Compute (Pseudocode)¶

Input: ordered eigenvalues or explained-variance ratios
Output: scree plot data points (and optionally cumulative curve)

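total <- sum_{j=1..K} eigenvalue[j]   # only needed when starting from eigenvalues
for k from 1 to K:
  evr[k] <- eigenvalue[k] / total   # skip if explained-variance ratios are given
  y[k] <- eigenvalue[k] or evr[k]
  cumulative[k] <- sum_{j=1..k} evr[j]   # optional
plot k vs y[k] (and/or cumulative[k])
return plot data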
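
The pseudocode above could be written in Python roughly as follows. This is a minimal sketch assuming NumPy, non-negative eigenvalues, and a positive total; the function name `scree_data` and the sample eigenvalues are made up for illustration.

```python
import numpy as np

def scree_data(eigenvalues):
    """Return component indices, sorted eigenvalues, EVR, and cumulative EVR.

    Hypothetical helper: assumes non-negative eigenvalues with a positive sum.
    """
    # Sort in decreasing order so component 1 explains the most variance.
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    evr = lam / lam.sum()            # explained-variance ratio per component
    cumulative = np.cumsum(evr)      # running total for the cumulative curve
    k = np.arange(1, lam.size + 1)   # 1-based component index
    return k, lam, evr, cumulative

# Illustrative eigenvalues, deliberately unsorted.
k, lam, evr, cumulative = scree_data([2.1, 0.3, 4.2, 0.9])
```

The returned arrays can be passed to any plotting library, for example `plt.bar(k, evr)` and `plt.plot(k, cumulative)` with Matplotlib.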

Complexity¶

Time: \(O(K)\) once eigenvalues/EVR values are available
Space: \(O(K)\) for plot vectors
Assumptions: Cost of PCA/eigendecomposition is excluded; this card covers postprocessing/visualization preparation