Permutation Test¶

Formula¶

\[ p\text{-value} \approx \frac{1+\sum_{b=1}^{B}\mathbf{1}[T^{*(b)}\ge T_{obs}]}{B+1} \]

Parameters¶

\(T_{obs}\): observed test statistic
\(T^{*(b)}\): statistic after label shuffling

What it means¶

A permutation test builds a null distribution by shuffling labels/assignments under exchangeability.

What it's used for¶

Nonparametric hypothesis testing for mean differences, correlations, and model comparisons.
A/B tests when parametric assumptions are questionable.

Key properties¶

Directly approximates the null for the chosen statistic.
Interpretation still depends on valid exchangeability assumptions.

Common gotchas¶

Do not permute when structure (time, pairs, clusters) breaks exchangeability.
Monte Carlo error decreases as permutations increase.

Example¶

Shuffle treatment labels many times to test whether observed uplift exceeds chance.

How to Compute (Pseudocode)¶

Input: data split into groups, test statistic T, number of permutations B
Output: permutation p-value

compute observed statistic T_obs
count_extreme <- 0
for b from 1 to B:
  permute labels/assignments under the null
  compute T_star on permuted data
  if T_star is at least as extreme as T_obs:
    count_extreme <- count_extreme + 1
return (1 + count_extreme) / (B + 1)

Complexity¶

Time: \(O(B \cdot \mathrm{StatCost})\), where \(\mathrm{StatCost}\) is the cost to recompute the test statistic after one permutation
Space: \(O(1)\) extra if statistics are streamed, or \(O(B)\) if storing all null statistics
Assumptions: Valid exchangeability under the null; \(B\) controls Monte Carlo error and runtime