
Dice Coefficient (Sørensen-Dice)

Formula

\[ D(A,B) = \frac{2|A\cap B|}{|A|+|B|} \]
\[ D = \frac{2\mathrm{TP}}{2\mathrm{TP}+\mathrm{FP}+\mathrm{FN}} \]

Parameters

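  • \(A, B\): the two sets being compared, e.g. predicted and reference positives (first equation)
  • \(\mathrm{TP}, \mathrm{FP}, \mathrm{FN}\): true-positive, false-positive, and false-negative counts from a binary comparison (second equation)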
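The two equations describe the same quantity: with \(A\) as the reference set and \(B\) as the predicted set, \(|A\cap B|=\mathrm{TP}\), \(|A|=\mathrm{TP}+\mathrm{FN}\), and \(|B|=\mathrm{TP}+\mathrm{FP}\). A minimal sketch of both forms follows; the function names `dice_sets` and `dice_counts` and the toy sets are illustrative assumptions, not part of any library.

```python
import math


def dice_sets(a: set, b: set) -> float:
    """Set form: 2|A ∩ B| / (|A| + |B|). Assumes at least one set is non-empty."""
    return 2 * len(a & b) / (len(a) + len(b))


def dice_counts(tp: int, fp: int, fn: int) -> float:
    """Count form: 2TP / (2TP + FP + FN). Assumes the denominator is non-zero."""
    return 2 * tp / (2 * tp + fp + fn)


# Toy data (hypothetical): reference positives and predicted positives.
truth = {1, 2, 3, 4}
pred = {3, 4, 5}

tp = len(truth & pred)  # items in both sets
fp = len(pred - truth)  # predicted but not in the reference
fn = len(truth - pred)  # in the reference but not predicted

set_score = dice_sets(truth, pred)
count_score = dice_counts(tp, fp, fn)

# Both forms give 4/7 for this data.
print(set_score, count_score, math.isclose(set_score, count_score))
```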

What it means

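An overlap measure: twice the size of the intersection divided by the sum of the two set sizes. Equivalently, the intersection size relative to the average size of the two sets.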

What it's used for

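  • Scoring overlap between predicted and reference regions in segmentation, or between predicted and reference sets.
  • Comparing binary masks when true negatives (e.g. background pixels) should not contribute to the score.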

Key properties

  • Range \([0,1]\); 1 means perfect overlap, 0 means no overlap.
  • Symmetric: \(D(A,B)=D(B,A)\).
  • Relation to Jaccard: \(D=\frac{2J}{1+J}\), \(J=\frac{D}{2-D}\).
  • For binary classification, Dice equals \(F_1\).
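The properties above can be checked numerically. The sketch below uses the counts \(\mathrm{TP}=30, \mathrm{FP}=10, \mathrm{FN}=5\) and assumes the usual definitions \(J=\mathrm{TP}/(\mathrm{TP}+\mathrm{FP}+\mathrm{FN})\) and \(F_1\) as the harmonic mean of precision and recall; the helper names are illustrative.

```python
import math


def dice(tp: int, fp: int, fn: int) -> float:
    return 2 * tp / (2 * tp + fp + fn)


def jaccard(tp: int, fp: int, fn: int) -> float:
    return tp / (tp + fp + fn)


def f1(tp: int, fp: int, fn: int) -> float:
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


tp, fp, fn = 30, 10, 5
d = dice(tp, fp, fn)
j = jaccard(tp, fp, fn)

# Swapping the roles of A and B swaps FP and FN, which leaves D unchanged.
symmetric = math.isclose(d, dice(tp, fn, fp))

# Conversions between Dice and Jaccard, in both directions.
dice_from_jaccard = 2 * j / (1 + j)
jaccard_from_dice = d / (2 - d)

# Dice coincides with F1 on the same binary counts.
equals_f1 = math.isclose(d, f1(tp, fp, fn))

print(d, j, symmetric, dice_from_jaccard, jaccard_from_dice, equals_f1)
```

Because \(D=2J/(1+J)\) is monotonically increasing in \(J\), the two metrics always rank a pair of predictions the same way on a single example, although \(D \ge J\) throughout.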

Common gotchas

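  • Undefined (\(0/0\)) when both sets are empty; choose a convention (often 1.0) and apply it consistently.
  • Ignores true negatives, so the score depends on which class is treated as positive; when positives dominate, a trivial all-positive prediction can still score high.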
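Both gotchas can be illustrated with a short sketch. The `empty_value` parameter and the function name `dice_safe` are assumptions made for illustration; the imbalance numbers are made-up counts, not measured results.

```python
def dice_safe(tp: int, fp: int, fn: int, empty_value: float = 1.0) -> float:
    """Dice with an explicit convention for the 0/0 case."""
    denom = 2 * tp + fp + fn
    if denom == 0:
        # Both sets are empty: return the chosen convention instead of dividing by zero.
        return empty_value
    return 2 * tp / denom


# Empty prediction compared with an empty reference.
empty_score = dice_safe(0, 0, 0)

# Hypothetical data with 90 positives and 10 negatives, predicting everything positive.
majority_positive = dice_safe(tp=90, fp=10, fn=0)  # 180 / 190

# Same trivial predictor when positives are the rare class (10 positives, 90 negatives).
minority_positive = dice_safe(tp=10, fp=90, fn=0)  # 20 / 110

print(empty_score, round(majority_positive, 3), round(minority_positive, 3))
```

The same trivial predictor scores about 0.947 in the first case and about 0.182 in the second, so the choice of positive class matters when interpreting a Dice value.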

Example

If \(\mathrm{TP}=30, \mathrm{FP}=10, \mathrm{FN}=5\), then \(D=\frac{2\cdot 30}{2\cdot 30+10+5}=\frac{60}{75}=0.800\).

How to Compute (Pseudocode)

Input: true and predicted binary labels of equal length (or two sets / binary masks of equal shape)
Output: Dice score

TP = number of positions where true = 1 and predicted = 1
FP = number of positions where true = 0 and predicted = 1
FN = number of positions where true = 1 and predicted = 0
if 2*TP + FP + FN = 0: return the chosen empty-case convention (often 1.0)
return 2*TP / (2*TP + FP + FN)
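A minimal single-pass sketch of this computation in plain Python follows. It assumes labels are given as equal-length sequences of 0/1 values; the function name `dice_from_labels` and the `empty_value` parameter are hypothetical, not taken from any library.

```python
from typing import Sequence


def dice_from_labels(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    empty_value: float = 1.0,
) -> float:
    """Accumulate TP, FP, FN in one pass and return the Dice score."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t and p:
            tp += 1  # predicted positive, actually positive
        elif p:
            fp += 1  # predicted positive, actually negative
        elif t:
            fn += 1  # predicted negative, actually positive
        # true negatives are not counted

    denom = 2 * tp + fp + fn
    if denom == 0:
        return empty_value  # no positives in either input
    return 2 * tp / denom


y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = dice_from_labels(y_true, y_pred)  # TP=2, FP=1, FN=1, so 4/6
print(round(score, 4))
```

For a 2D mask, flattening it to a single sequence before the call gives the same result, since the score depends only on the counts.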

Complexity

  • Time: \(O(n)\), a single pass over \(n\) aligned labels or mask elements to accumulate the counts
  • Space: \(O(1)\) for the three count accumulators (\(\mathrm{TP}\), \(\mathrm{FP}\), \(\mathrm{FN}\))
  • Assumptions: Binary formulation with inputs already aligned; a multiclass variant that scores each of \(k\) classes separately needs \(O(k)\) counters

See also