False Discovery Rate (FDR)¶

Formula¶

\[ \mathrm{FDR}=\mathbb{E}\left[\frac{V}{\max(R,1)}\right] \]

Parameters¶

\(V\): number of false rejections, i.e. true null hypotheses that were rejected (false positives)
\(R\): total number of rejections (discoveries), true or false

What it means¶

FDR is the expected fraction of rejected hypotheses that are false positives. The \(\max(R,1)\) in the denominator defines that fraction as 0 when nothing is rejected.
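The quantity inside the expectation, often called the false discovery proportion, can be computed directly when the ground truth is known (as in a simulation). The sketch below is illustrative only: the function name `false_discovery_proportion` and the toy inputs are assumptions, not part of any library.

```python
def false_discovery_proportion(rejected, is_true_null):
    """Return V / max(R, 1) for a single experiment.

    rejected[i]     -- True if hypothesis i was rejected
    is_true_null[i] -- True if the null for hypothesis i actually holds
    (Both names are hypothetical, chosen for this example.)
    """
    if len(rejected) != len(is_true_null):
        raise ValueError("inputs must have the same length")
    # R: total number of rejections
    r = sum(1 for rej in rejected if rej)
    # V: rejections whose null hypothesis is actually true
    v = sum(1 for rej, null in zip(rejected, is_true_null) if rej and null)
    return v / max(r, 1)


# Toy data: four rejections, one of which is a true null.
rejected = [True, True, True, True, False, False]
is_true_null = [False, False, False, True, True, False]
fdp = false_discovery_proportion(rejected, is_true_null)
```

FDR is the average of this value over repeated experiments; in real data `is_true_null` is unknown, so the proportion itself cannot be observed.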

What it's used for¶

Large-scale testing where some false positives are tolerable but should be controlled.
Reporting discoveries in exploratory analysis.

Key properties¶

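Less conservative than family-wise error rate (FWER) control, so it typically yields more rejections at the same nominal level.
Commonly controlled using the Benjamini-Hochberg procedure, which is valid for independent (and certain positively dependent) p-values.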
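To illustrate the gap, take Bonferroni as one example of family-wise control (an assumption here; the text does not name a specific method) and set its rule beside the Benjamini-Hochberg step-up rule for \(m\) hypotheses at level \(\alpha\), with sorted p-values \(p_{(1)} \le \dots \le p_{(m)}\):

\[ \text{Bonferroni: reject } H_i \text{ if } p_i \le \frac{\alpha}{m} \qquad\qquad \text{Benjamini-Hochberg: reject } H_{(1)},\dots,H_{(k^*)} \text{ with } k^* = \max\left\{k : p_{(k)} \le \frac{k\,\alpha}{m}\right\} \]

The two thresholds coincide at \(k=1\), and the Benjamini-Hochberg threshold grows linearly with rank. Any hypothesis Bonferroni rejects is therefore also rejected by Benjamini-Hochberg, which usually rejects more.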

Common gotchas¶

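FDR is an expectation over repeated experiments, not a guarantee for any single one: the realized proportion of false discoveries in one analysis can exceed the target level.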
Interpretation depends on how hypotheses are defined and selected.
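A small simulation can make the first gotcha tangible. The sketch below rests on several assumptions made for illustration only: independent one-sided z-tests, 200 hypotheses of which 160 are true nulls, a shift of 3 for the non-nulls, and Benjamini-Hochberg as the control procedure. All function names are hypothetical.

```python
import random
from statistics import NormalDist


def bh_reject(p_values, q):
    """Indices rejected by the Benjamini-Hochberg step-up rule at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank  # keep the largest rank that meets its threshold
    return set(order[:k_max])


def simulate_fdp(rng, m=200, m_null=160, effect=3.0, q=0.05):
    """Run one simulated experiment and return its V / max(R, 1)."""
    std_normal = NormalDist()
    p_values = []
    for i in range(m):
        # The first m_null hypotheses are true nulls (mean 0).
        mean = 0.0 if i < m_null else effect
        z = rng.gauss(mean, 1.0)
        p_values.append(1.0 - std_normal.cdf(z))  # one-sided p-value
    rejected = bh_reject(p_values, q)
    v = sum(1 for i in rejected if i < m_null)
    return v / max(len(rejected), 1)


rng = random.Random(0)  # fixed seed so the run is repeatable
fdps = [simulate_fdp(rng) for _ in range(500)]
mean_fdp = sum(fdps) / len(fdps)  # estimate of the FDR
max_fdp = max(fdps)               # worst single experiment
```

Under these assumptions `mean_fdp` should sit below the 5% target, while `max_fdp` shows that individual experiments can land well above it.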

Example¶

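In gene screening, controlling FDR at 5% balances discovery count and false positives. For instance, if 200 genes are reported at that level, roughly 10 of them are expected to be false positives on average.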

How to Compute (Pseudocode)¶

Input: set of hypotheses/p-values and a target error-rate criterion
Output: adjusted decisions or error-rate summary

collect p-values from the hypothesis family
apply the chosen multiple-testing/error-rate control procedure
report adjusted decision threshold(s), rejections, or error-rate summary
return results
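As one concrete instance of the generic steps above, here is a minimal sketch that uses Benjamini-Hochberg as the chosen procedure. The function names `bh_adjust` and `bh_decisions` are hypothetical, and the sketch assumes the p-values meet the procedure's conditions (for example, independence).

```python
def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping the adjusted values monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def bh_decisions(p_values, q=0.05):
    """Reject hypothesis i when its adjusted p-value is at most q."""
    return [adj <= q for adj in bh_adjust(p_values)]


p_values = [0.02, 0.03, 0.035, 0.04]
adjusted = bh_adjust(p_values)
decisions = bh_decisions(p_values, q=0.05)
```

In this toy input the smallest p-value (0.02) misses its own threshold of 0.0125, yet all four hypotheses are rejected because the largest one meets its threshold. That is the step-up behavior. Sorting dominates the cost.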

Complexity¶

Time: Depends on the procedure; many standard methods are dominated by sorting (\(O(m\log m)\) for \(m\) hypotheses)
Space: \(O(m)\) for p-values and adjusted decisions/ordering
Assumptions: Hypotheses are treated as a specified family and the chosen procedure's assumptions determine validity