Skip to content

Data Science Field Guide

Modularity (Community Quality)

Modularity¶

Formula¶

\[ Q = \frac{1}{2m}\sum_{i,j}\left(A_{ij} - \frac{k_i k_j}{2m}\right)\,\mathbf{1}[c_i=c_j] \]

Parameters¶

\(A_{ij}\): edge weight between \(i\) and \(j\) (0/1 if unweighted)
\(k_i\): degree (or weighted degree/strength)
\(m\): number of edges (or total weight / 2)
\(c_i\): community label

What it means¶

Compares actual within-community edges to the expected count under a degree-preserving “null model”.

What it's used for¶

Evaluating the quality of a community assignment.
Objective optimized by Louvain/Leiden algorithms.

Key properties¶

Higher \(Q\) suggests stronger community structure
Often optimized by Louvain/Leiden heuristics

Common gotchas¶

Resolution limit: modularity can miss small communities in large graphs
There are variants with a resolution parameter \(\gamma\): [ A_{ij} - \gamma\frac{k_i k_j}{2m} ]

Example¶

Graph with edges (1,2) and (3,4), communities {1,2} and {3,4}: \(m=2\), all degrees 1, and \(Q=0.5\).

How to Compute (Pseudocode)¶

Input: graph G, community labels c[1..n]
Output: modularity score Q

compute node degrees k_i and total edge count/weight m
Q <- 0
for each pair (i, j) (or each edge/nonzero term in an optimized implementation):
  if c[i] == c[j]:
    Q <- Q + (A[i,j] - (k_i * k_j) / (2m))
Q <- Q / (2m)
return Q

Complexity¶

Time: Depends on the implementation and representation; dense formula evaluation is \(O(n^2)\), while sparse/community-aggregated implementations can be much faster in practice
Space: Depends on graph representation; dense adjacency is \(O(n^2)\), sparse storage is typically \(O(|V|+|E|)\)
Assumptions: Undirected modularity formula shown; weighted/directed/resolution-parameter variants modify the computation