Precision-Recall Curve

Formula

\[ \mathcal{C} = \{(\mathrm{Precision}(t),\,\mathrm{Recall}(t)) : t \in \mathbb{R}\} \]

Plot

[Figure: example precision-recall curve (illustrative); precision decreases as recall increases from 0 to 1.]

Parameters

  • \(t\): decision threshold applied to the classifier's scores
  • \(\mathrm{Precision}(t) = \mathrm{TP}(t)/(\mathrm{TP}(t)+\mathrm{FP}(t))\)
  • \(\mathrm{Recall}(t) = \mathrm{TP}(t)/(\mathrm{TP}(t)+\mathrm{FN}(t))\)

What it means

The curve traces the tradeoff between precision and recall as the decision threshold \(t\) varies: lowering \(t\) typically raises recall at the cost of precision.

What it's used for

  • Visualizing precision vs recall across thresholds.
  • Choosing operating points for imbalanced data.

Key properties

  • More informative than the ROC curve under heavy class imbalance
  • The area under the PR curve equals average precision (AP) under common step-interpolation definitions
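To make the second property concrete, here is a minimal sketch of AP as the step-wise area under the PR curve, \(\mathrm{AP} = \sum_i (R_i - R_{i-1})\,P_i\). The curve points are hypothetical, chosen only for illustration.

```python
# Illustrative: average precision as the step-wise area under the PR curve,
# AP = sum_i (R_i - R_{i-1}) * P_i (a common definition of AP).
# Hypothetical (recall, precision) points, ordered from low to high recall.
points = [(0.0, 1.0), (0.3, 0.9), (0.6, 0.85), (0.9, 0.6)]

ap = 0.0
prev_recall = 0.0
for recall, precision in points:
    ap += (recall - prev_recall) * precision  # width of the step times its height
    prev_recall = recall
print(round(ap, 4))  # -> 0.705
```

Note that a trapezoidal rule over the same points would give a different number; this is exactly the "interpolation conventions" gotcha below.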

Common gotchas

  • Baseline is the positive class prevalence, not 0.5.
  • Different interpolation conventions change AP.

Example

Two thresholds might give points \((\mathrm{Recall},\mathrm{Precision})=(0.9,0.6)\) and \((0.6,0.85)\).
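One way to compare the two example operating points is by F1 score, \(F_1 = 2PR/(P+R)\). A quick check, using the hypothetical points above:

```python
# Compare the two illustrative operating points from the example by F1 score.
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

a = f1(0.6, 0.9)   # the high-recall point
b = f1(0.85, 0.6)  # the high-precision point
print(round(a, 3), round(b, 3))  # -> 0.72 0.703
```

By this particular criterion the high-recall point wins, but the right operating point depends on the relative cost of false positives and false negatives.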

How to Compute (Pseudocode)

Input: scores p_hat[1..n], labels y[1..n]
Output: PR curve points

sort examples by score descending
sweep the threshold from high to low through the unique scores
at each threshold, update TP and FP counts incrementally
record the curve point (Recall(t), Precision(t))
return all curve points
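The pseudocode above can be sketched as follows, assuming binary labels (0/1) and real-valued scores; tied scores are consumed together so each unique score yields one point.

```python
# A minimal sketch of the threshold sweep, assuming binary labels (0/1) and
# real-valued scores. One point is emitted per unique score value.
def pr_curve(scores, labels):
    """Return (recall, precision) points, swept from the highest threshold down."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(labels)
    tp = fp = 0
    points = []
    i, n = 0, len(order)
    while i < n:
        thresh = scores[order[i]]
        # Consume all examples tied at this score before recording a point.
        while i < n and scores[order[i]] == thresh:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))  # (recall, precision)
    return points

# Toy data: 5 scored examples, 3 of them positive.
pts = pr_curve([0.9, 0.8, 0.7, 0.6, 0.5], [1, 0, 1, 1, 0])
print(pts)
```

The sort dominates the cost, matching the \(O(n\log n)\) time and \(O(n)\) space stated below.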

Complexity

  • Time: \(O(n\log n)\) due to sorting, plus a linear threshold sweep
  • Space: \(O(n)\) for sorted scores/labels and output curve points
  • Assumptions: Binary ranking scores; ties and interpolation conventions depend on the implementation

See also