Skip to content

Concept Drift

Formula

\[ P_t(Y\mid X) \ne P_{t'}(Y\mid X) \]

Parameters

  • \(P_t(Y\mid X)\): label relationship at time \(t\)

What it means

Concept drift means the relationship between features and outcomes changes over time.

What it's used for

  • Monitoring and retraining predictive models.
  • Detecting changing fraud/adversarial behavior or policy changes.

Key properties

  • More damaging than pure data drift because the model mapping itself becomes stale.
  • Can be abrupt, gradual, recurring, or seasonal.

Common gotchas

  • Needs delayed labels to confirm in many systems.
  • Performance drops can also come from pipeline bugs, not drift.

Example

Fraudsters change tactics, so old transaction patterns no longer predict fraud as well.

How to Compute (Pseudocode)

Input: time-stamped model predictions/features and delayed labels (if available)
Output: concept-drift monitoring signals

monitor model performance over time windows when labels arrive
compare recent conditional behavior/performance to a baseline period
rule out obvious pipeline/data-quality issues
if sustained degradation suggests P(Y|X) changed:
  trigger investigation and retraining/model update workflow

Complexity

  • Time: Depends on monitoring cadence, window sizes, and metric computations (often periodic \(O(n_w)\) window scans plus alerting logic)
  • Space: Depends on retained prediction/label history and summary dashboards
  • Assumptions: Label-delayed monitoring workflow; true concept-drift confirmation often depends on business-specific feedback loops