Skip to content

Label Propagation (LPA)

Formula

\[ \ell_i \leftarrow \arg\max_c \sum_j A_{ij}\,\mathbf{1}[\ell_j=c] \]

Parameters

  • \(A_{ij}\): adjacency (or weight) between nodes \(i,j\)
  • \(\ell_i\): label of node \(i\)
  • \(c\): candidate label

What it means

Iteratively updates node labels to match the most frequent label in the neighborhood.

What it's used for

  • Fast, scalable community detection.
  • Initializing or refining graph partitions.

Key properties

  • Near-linear time per iteration.
  • Often converges to a partition without specifying the number of communities.

Common gotchas

  • Results depend on update order and tie-breaking.
  • Can yield a single giant community on some graphs.

Example

If node \(i\) has neighbors labeled \([2,2,3]\), then \(\ell_i\leftarrow 2\).

How to Compute (Pseudocode)

Input: graph G=(V,E), max iterations T
Output: node labels l[.]

initialize each node with a unique label

for iter from 1 to T:
  changed <- false
  for each node i (often in random order):
    choose the most frequent neighbor label around i
    break ties by a fixed rule or randomly
    if new label differs from l[i]:
      l[i] <- new label
      changed <- true
  if not changed:
    break

return labels l

Complexity

  • Time: Typically \(O(T|E|)\) for \(T\) passes over neighborhoods
  • Space: \(O(|V|+|E|)\) including graph storage and labels
  • Assumptions: Neighborhood scans dominate cost; runtime and output depend on update order and tie-breaking strategy