Normalized Mutual Information (NMI)¶
Formula¶
\[
\mathrm{NMI}(U,V)=\frac{I(U;V)}{\sqrt{H(U)\,H(V)}}
\]
Parameters¶
- \(U,V\): two cluster labelings / partitions
- \(I(U;V)\): mutual information between partitions
- \(H(U),H(V)\): entropies of the partitions
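In the usual plug-in form, both quantities come from the contingency table of the two partitions. Writing \(n_{ij}\) for the number of samples in cluster \(i\) of \(U\) and cluster \(j\) of \(V\), with row sums \(a_i\), column sums \(b_j\), and \(n\) samples in total (notation introduced here for concreteness, not taken from the formula above):
\[
I(U;V)=\sum_{i,j}\frac{n_{ij}}{n}\log\frac{n\,n_{ij}}{a_i\,b_j},
\qquad
H(U)=-\sum_i \frac{a_i}{n}\log\frac{a_i}{n}
\]
with \(H(V)\) defined analogously from the \(b_j\).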
What it means¶
Measures how much information two clusterings share: the mutual information between the partitions, rescaled by their entropies so the score stays in a fixed range rather than growing with the number of clusters.
What it's used for¶
- External clustering evaluation against reference labels.
- Comparing agreement across runs/algorithms.
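For external evaluation, a minimal sketch with scikit-learn (the dataset, model, and seeds are illustrative choices, not part of the metric):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import normalized_mutual_info_score

# Toy data with known reference labels y_true.
X, y_true = make_blobs(n_samples=300, centers=3, random_state=0)
y_pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# "geometric" matches the formula at the top of this page;
# scikit-learn's default average_method is "arithmetic".
print(normalized_mutual_info_score(y_true, y_pred, average_method="geometric"))
```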
Key properties¶
- In \([0,1]\) for the standard normalizations, with \(1\) indicating identical partitions (up to permutation)
- Invariant to label permutation
- Several normalization variants exist
Common gotchas¶
- Libraries may use different normalizations (geometric mean, arithmetic mean, max).
- Compare scores only when using the same NMI definition.
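A quick way to see this, assuming scikit-learn and toy labelings invented for illustration; whenever \(H(U)\neq H(V)\), the four normalizations give four different scores:

```python
from sklearn.metrics import normalized_mutual_info_score

a = [0, 0, 1, 1, 2, 2]   # three clusters of size 2 each
b = [0, 0, 1, 1, 1, 2]   # cluster sizes 2, 3, 1 -> different entropy

for method in ("arithmetic", "geometric", "max", "min"):
    score = normalized_mutual_info_score(a, b, average_method=method)
    print(f"{method:>10}: {score:.4f}")
```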
Example¶
If two partitions are identical up to label names, NMI is 1.
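For instance, with made-up labels that group the samples identically but name the clusters differently:

```python
from sklearn.metrics import normalized_mutual_info_score

u = [0, 0, 1, 1, 2, 2]
v = [2, 2, 0, 0, 1, 1]   # same grouping, permuted label names

print(normalized_mutual_info_score(u, v))  # 1.0
```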
How to Compute (Pseudocode)¶
Input: two label vectors \(U\) and \(V\) over the same \(n\) samples
Output: NMI score
build the contingency table of overlap counts between the clusters of \(U\) and \(V\)
compute \(I(U;V)\) from the table and \(H(U),H(V)\) from its marginals
divide \(I(U;V)\) by the chosen mean of the entropies (the geometric mean in the formula above)
return the score
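A from-scratch sketch of these steps in Python (NumPy assumed; the geometric-mean normalization matches the formula at the top of this page, and the zero-entropy fallback is a convention, not part of the definition):

```python
import numpy as np

def nmi_geometric(labels_true, labels_pred):
    """NMI with geometric-mean normalization, following the pseudocode above."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    n = labels_true.size

    # Step 1: contingency table, n_ij = |cluster i of U  intersect  cluster j of V|.
    _, u_idx = np.unique(labels_true, return_inverse=True)
    _, v_idx = np.unique(labels_pred, return_inverse=True)
    table = np.zeros((u_idx.max() + 1, v_idx.max() + 1))
    np.add.at(table, (u_idx, v_idx), 1)

    # Step 2: numerator I(U;V) and denominator entropies from the table.
    p_uv = table / n            # joint distribution over cluster pairs
    p_u = p_uv.sum(axis=1)      # marginal of U (row sums / n)
    p_v = p_uv.sum(axis=0)      # marginal of V (column sums / n)

    nz = p_uv > 0               # sum mutual information over nonzero cells only
    mi = np.sum(p_uv[nz] * np.log(p_uv[nz] / np.outer(p_u, p_v)[nz]))
    h_u = -np.sum(p_u[p_u > 0] * np.log(p_u[p_u > 0]))
    h_v = -np.sum(p_v[p_v > 0] * np.log(p_v[p_v > 0]))

    # Step 3: normalize by the geometric mean of the entropies.
    denom = np.sqrt(h_u * h_v)
    # Convention: two trivial one-cluster partitions count as identical.
    return mi / denom if denom > 0 else 1.0
```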
Complexity¶
- Time: \(O(n)\) to accumulate the contingency table over \(n\) samples, plus \(O(k_1 k_2)\) to sum over its cells
- Space: \(O(k_1 k_2)\) for the contingency table, where \(k_1\) and \(k_2\) are the numbers of clusters in \(U\) and \(V\)
- Assumptions: labels can be mapped to table indices in \(O(1)\) (e.g. integer-encoded); since typically \(k_1, k_2 \ll n\), the \(O(n)\) counting pass dominates