Permutation Test¶
Formula¶
\[
p\text{-value} \approx \frac{1+\sum_{b=1}^{B}\mathbf{1}[T^{*(b)}\ge T_{obs}]}{B+1}
\]
Parameters¶
- \(T_{obs}\): observed test statistic
- \(T^{*(b)}\): statistic after label shuffling
What it means¶
A permutation test builds a null distribution by shuffling labels/assignments under exchangeability.
What it's used for¶
- Nonparametric hypothesis testing for mean differences, correlations, and model comparisons.
- A/B tests when parametric assumptions are questionable.
Key properties¶
- Directly approximates the null for the chosen statistic.
- Interpretation still depends on valid exchangeability assumptions.
Common gotchas¶
- Do not permute when structure (time, pairs, clusters) breaks exchangeability.
- Monte Carlo error decreases as permutations increase.
Example¶
Shuffle treatment labels many times to test whether observed uplift exceeds chance.
How to Compute (Pseudocode)¶
Input: data split into groups, test statistic T, number of permutations B
Output: permutation p-value
compute observed statistic T_obs
count_extreme <- 0
for b from 1 to B:
permute labels/assignments under the null
compute T_star on permuted data
if T_star is at least as extreme as T_obs:
count_extreme <- count_extreme + 1
return (1 + count_extreme) / (B + 1)
Complexity¶
- Time: \(O(B \cdot \mathrm{StatCost})\), where \(\mathrm{StatCost}\) is the cost to recompute the test statistic after one permutation
- Space: \(O(1)\) extra if statistics are streamed, or \(O(B)\) if storing all null statistics
- Assumptions: Valid exchangeability under the null; \(B\) controls Monte Carlo error and runtime