Concept Drift¶

Formula¶

\[ P_t(Y\mid X) \ne P_{t'}(Y\mid X) \quad \text{for some } t \ne t' \]

Parameters¶

\(P_t(Y\mid X)\): conditional distribution of the label \(Y\) given features \(X\) at time \(t\)
\(t, t'\): the two points in time being compared

What it means¶

Concept drift means the relationship between features and outcomes changes over time, so the same input \(X\) no longer implies the same distribution over outcomes \(Y\).
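
A minimal sketch of this definition, using a synthetic setup that is an assumption and not part of the original text: \(X\) is uniform on [0, 1] at both times, and only the labeling threshold moves from 0.5 to 0.7. The helper names `sample` and `accuracy` are hypothetical. A rule fitted to the old concept keeps perfect accuracy on old data but misclassifies the inputs that fall between the two thresholds, about 20% of them in expectation.

```python
import random


def sample(n, threshold, rng):
    """Draw n (x, y) pairs with x ~ Uniform(0, 1) and y = 1 when x > threshold."""
    xs = [rng.random() for _ in range(n)]
    return [(x, int(x > threshold)) for x in xs]


def accuracy(model_threshold, data):
    """Accuracy of the fixed rule 'predict 1 when x > model_threshold'."""
    hits = sum(int(x > model_threshold) == y for x, y in data)
    return hits / len(data)


rng = random.Random(0)

# P(X) is identical in both periods; only P(Y|X) changes.
before = sample(10_000, threshold=0.5, rng=rng)  # concept at time t
after = sample(10_000, threshold=0.7, rng=rng)   # concept at time t'

# The model was fitted to the old concept and is never updated.
acc_before = accuracy(0.5, before)
acc_after = accuracy(0.5, after)

print(f"accuracy before drift: {acc_before:.3f}")
print(f"accuracy after drift:  {acc_after:.3f}")
```

Because the inputs are drawn the same way in both periods, a monitor that only watches feature distributions would see nothing unusual here.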

What it's used for¶

Monitoring and retraining predictive models.
Detecting changing fraud/adversarial behavior or policy changes.

Key properties¶

Often more damaging than pure data drift (a change in \(P(X)\) alone) because the learned mapping from features to labels itself becomes stale.
Can be abrupt, gradual, recurring, or seasonal.
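
One way to picture these drift shapes is as a schedule for how much of the new concept is active at each time step. The sketch below is illustrative only: the function name `drift_schedule` and its parameters `change_at`, `ramp`, and `period` are hypothetical, and seasonal drift is treated as recurring drift with a fixed period.

```python
def drift_schedule(kind, t, change_at=100, ramp=50, period=40):
    """Return a weight in [0, 1]: 0 means old concept, 1 means new concept.

    The weight can be read as the probability that an example at time t
    is labeled according to the new concept.
    """
    if kind == "abrupt":
        # The concept switches in a single step.
        return 0.0 if t < change_at else 1.0
    if kind == "gradual":
        # The new concept is phased in linearly over `ramp` steps.
        return min(1.0, max(0.0, (t - change_at) / ramp))
    if kind == "recurring":
        # The two concepts alternate every `period` steps (seasonal pattern).
        return float((t // period) % 2)
    raise ValueError(f"unknown drift kind: {kind}")


for kind in ("abrupt", "gradual", "recurring"):
    weights = [drift_schedule(kind, t) for t in (0, 99, 100, 125, 150)]
    print(kind, weights)
```

The shape matters for monitoring: an abrupt change shows up in a single window, while a gradual one may only cross an alert threshold after several windows.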

Common gotchas¶

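In many systems, confirming drift requires ground-truth labels, which often arrive with a delay.
A performance drop can also come from pipeline bugs or data-quality issues rather than drift, so rule those out first.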
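
The label-delay gotcha usually means predictions have to be held until their outcomes arrive. A minimal sketch, assuming each prediction carries a unique ID that the later label also references; the class `DelayedLabelJoiner` and its method names are hypothetical, not taken from any library.

```python
class DelayedLabelJoiner:
    """Hold predictions until their ground-truth labels arrive."""

    def __init__(self):
        # prediction_id -> (timestamp, prediction)
        self._pending = {}

    def record_prediction(self, prediction_id, timestamp, prediction):
        """Store a prediction that has no label yet."""
        self._pending[prediction_id] = (timestamp, prediction)

    def record_label(self, prediction_id, label):
        """Return (timestamp, prediction, label), or None for an unknown ID."""
        entry = self._pending.pop(prediction_id, None)
        if entry is None:
            return None
        timestamp, prediction = entry
        return (timestamp, prediction, label)

    def pending_count(self):
        """Number of predictions still waiting for a label."""
        return len(self._pending)


joiner = DelayedLabelJoiner()
joiner.record_prediction("txn-1", timestamp=1, prediction=0)
joiner.record_prediction("txn-2", timestamp=2, prediction=1)

# The label for txn-1 arrives later and is matched to its prediction.
print(joiner.record_label("txn-1", label=1))
print(joiner.pending_count())
```

A pending count that keeps growing is itself worth watching: it can point to a broken label feed, which is a pipeline problem and not drift.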

Example¶

Fraudsters change tactics, so old transaction patterns no longer predict fraud as well.

How to Compute (Pseudocode)¶

Input: time-stamped model predictions/features and delayed labels (if available)
Output: concept-drift monitoring signals

monitor model performance over time windows when labels arrive
compare recent conditional behavior/performance to a baseline period
rule out obvious pipeline/data-quality issues
if sustained degradation suggests P(Y|X) changed:
  trigger investigation and retraining/model update workflow
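
The pseudocode above could be sketched in Python roughly as follows. This is one possible reading and not a definitive implementation: it uses accuracy as the metric, non-overlapping windows, and a "sustained" rule of `patience` consecutive degraded windows. The class name `WindowedAccuracyMonitor` and the parameters `max_drop` and `patience` are assumptions. Ruling out pipeline and data-quality issues is left to the caller.

```python
class WindowedAccuracyMonitor:
    """Flag sustained accuracy degradation relative to a baseline period."""

    def __init__(self, baseline_accuracy, window_size=200, max_drop=0.05, patience=3):
        self.baseline = baseline_accuracy
        self.window_size = window_size
        self.max_drop = max_drop      # tolerated drop below the baseline
        self.patience = patience      # consecutive bad windows before alerting
        self._hits = 0
        self._seen = 0
        self._bad_windows = 0

    def update(self, prediction, label):
        """Record one labeled outcome.

        Returns True only on the update that completes a window when the
        last `patience` windows were all degraded; otherwise returns False.
        """
        self._hits += int(prediction == label)
        self._seen += 1
        if self._seen < self.window_size:
            return False

        # A window just completed: score it and reset the counters.
        window_accuracy = self._hits / self._seen
        self._hits = 0
        self._seen = 0

        if window_accuracy < self.baseline - self.max_drop:
            self._bad_windows += 1
        else:
            self._bad_windows = 0  # a healthy window breaks the streak
        return self._bad_windows >= self.patience


monitor = WindowedAccuracyMonitor(baseline_accuracy=0.95, window_size=10, patience=2)

# Twenty correct predictions, then twenty wrong ones.
stream = [(1, 1)] * 20 + [(1, 0)] * 20
for step, (prediction, label) in enumerate(stream, start=1):
    if monitor.update(prediction, label):
        print(f"step {step}: sustained degradation, start investigation")
```

Requiring several bad windows in a row trades detection speed for fewer false alarms; a smaller `patience` reacts faster to abrupt drift but is more sensitive to noise.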

Complexity¶

Time: Depends on monitoring cadence, window sizes, and metric computations (often periodic \(O(n_w)\) window scans plus alerting logic)
Space: Depends on retained prediction/label history and summary dashboards
Assumptions: Label-delayed monitoring workflow; true concept-drift confirmation often depends on business-specific feedback loops