
False Discovery Rate (FDR)

Formula

\[ \mathrm{FDR}=\mathbb{E}\left[\frac{V}{\max(R,1)}\right] \]

Parameters

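  • \(V\): number of false rejections, i.e. true null hypotheses that were rejected (false discoveries)
  • \(R\): total number of rejections (discoveries), whether correct or not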

What it means

FDR is the expected fraction of rejected hypotheses that are false positives. The \(\max(R,1)\) in the denominator makes the fraction 0 when nothing is rejected.
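The quantity inside the expectation can be computed for a single experiment when the ground truth is known, which in practice means simulations or benchmarks. The sketch below is illustrative only: the function name `false_discovery_proportion` and the example labels are assumptions, not part of the definition above.

```python
def false_discovery_proportion(rejected, is_true_null):
    """Return V / max(R, 1) for one experiment.

    rejected[i]     -- True if hypothesis i was rejected
    is_true_null[i] -- True if null hypothesis i actually holds
                       (known only in simulations or benchmarks)
    """
    if len(rejected) != len(is_true_null):
        raise ValueError("inputs must have the same length")
    # R: total number of rejections
    r = sum(1 for rej in rejected if rej)
    # V: rejections of hypotheses whose null is actually true
    v = sum(1 for rej, null in zip(rejected, is_true_null) if rej and null)
    return v / max(r, 1)


# Made-up labels: four rejections, one of which is a true null.
rejected = [True, True, True, True, False, False]
is_true_null = [False, False, False, True, True, False]
fdp = false_discovery_proportion(rejected, is_true_null)
```

FDR is the average of this per-experiment proportion over hypothetical repetitions of the experiment.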

What it's used for

  • Large-scale testing where some false positives are tolerable but should be controlled.
  • Reporting discoveries in exploratory analysis.

Key properties

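  • Less conservative than family-wise error rate (FWER) control: it typically yields more rejections at the same nominal level, at the cost of tolerating some false positives.
  • Commonly controlled using the Benjamini-Hochberg (BH) procedure.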
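For reference, the Benjamini-Hochberg step-up rule at target level \(q\) can be written as follows, where \(p_{(1)}\le\dots\le p_{(m)}\) are the sorted p-values of the \(m\) hypotheses and \(H_{(i)}\) is the hypothesis belonging to \(p_{(i)}\). The symbols \(q\), \(m\), \(m_0\), and \(k\) are notation introduced here for illustration.

\[ k=\max\left\{i : p_{(i)}\le \frac{i}{m}\,q\right\}, \qquad \text{reject } H_{(1)},\dots,H_{(k)} \]

If no such \(i\) exists, nothing is rejected. For independent p-values this rule gives

\[ \mathrm{FDR}\le \frac{m_0}{m}\,q\le q \]

where \(m_0\) is the number of true null hypotheses. By comparison, Bonferroni control of the family-wise error rate at level \(q\) rejects only when \(p_i\le q/m\), which is the BH threshold at \(i=1\); every later BH threshold is larger, which is where the extra rejections come from.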

Common gotchas

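  • FDR is an expectation over repeated experiments, not a guarantee for any single one: the realized proportion of false discoveries in one experiment can exceed the target level.
  • Interpretation depends on how the family of hypotheses is defined and selected: the guarantee applies to the whole family that was tested, not to a subset picked after seeing the results.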
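The first gotcha above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a benchmark: the helper names `bh_rejections` and `simulate_fdp`, the sizes `m=100` and `m0=80`, the level `q=0.1`, and the Beta(0.1, 1) distribution for p-values under the alternative are all arbitrary choices made for illustration. P-values are drawn independently, which is the setting where Benjamini-Hochberg is known to control FDR.

```python
import random


def bh_rejections(pvalues, q):
    """Benjamini-Hochberg step-up: return the indices of rejected hypotheses."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k = rank  # keep the largest rank that passes its threshold
    return set(order[:k])


def simulate_fdp(rng, m=100, m0=80, q=0.1):
    """Run one simulated experiment and return V / max(R, 1)."""
    # The first m0 hypotheses are true nulls with Uniform(0, 1) p-values.
    pvalues = [rng.random() for _ in range(m0)]
    # The rest are alternatives; Beta(0.1, 1) is an arbitrary choice that
    # concentrates their p-values near 0.
    pvalues += [rng.betavariate(0.1, 1.0) for _ in range(m - m0)]
    rejected = bh_rejections(pvalues, q)
    v = sum(1 for i in rejected if i < m0)  # rejected true nulls
    return v / max(len(rejected), 1)


rng = random.Random(0)  # fixed seed so the run is repeatable
q = 0.1
fdps = [simulate_fdp(rng, q=q) for _ in range(2000)]
mean_fdp = sum(fdps) / len(fdps)
share_above_q = sum(f > q for f in fdps) / len(fdps)
print(f"mean FDP over experiments: {mean_fdp:.3f}")
print(f"share of experiments with FDP > q: {share_above_q:.2f}")
```

In this setup the average proportion should stay below `q`, while a noticeable share of individual experiments ends up above it. That gap is the difference between controlling an expectation and guaranteeing a bound for each experiment.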

Example

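In gene screening, controlling FDR at 5% means that, on average, no more than about 5% of the genes reported as significant are expected to be false positives. This allows many more discoveries than requiring a low chance of even a single false positive.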

How to Compute (Pseudocode)

Input: p-values for a family of m hypotheses and a target FDR level q
Output: rejected hypotheses and, optionally, adjusted p-values

collect p-values from the hypothesis family
apply the chosen FDR-controlling procedure (e.g. Benjamini-Hochberg) at level q
report the decision threshold, the rejections, and any adjusted p-values
return results
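One concrete way to fill in the generic steps above is the Benjamini-Hochberg procedure. The following is a minimal pure-Python sketch rather than a reference implementation: the function name `benjamini_hochberg`, the default `q=0.05`, and the example p-values are assumptions chosen for illustration.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return (rejected, adjusted) for the Benjamini-Hochberg procedure.

    rejected[i] -- True if hypothesis i is rejected at FDR level q
    adjusted[i] -- BH-adjusted p-value; hypothesis i is rejected
                   exactly when adjusted[i] <= q
    """
    m = len(pvalues)
    if m == 0:
        return [], []
    # Indices of the p-values in ascending order; sorting dominates the cost.
    order = sorted(range(m), key=lambda i: pvalues[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p_(j) * m / j over all ranks j at or above the current rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min

    rejected = [a <= q for a in adjusted]
    return rejected, adjusted


# Made-up p-values for ten hypotheses.
pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected, adjusted = benjamini_hochberg(pvalues, q=0.05)
```

Returning adjusted p-values alongside the decisions is a design choice: the same output can then be thresholded at a different level without rerunning the procedure.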

Complexity

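  • Time: depends on the procedure; many standard methods, including Benjamini-Hochberg, are dominated by sorting (\(O(m\log m)\) for \(m\) hypotheses)
  • Space: \(O(m)\) for the p-values, their ordering, and the adjusted decisions
  • Assumptions: the hypotheses form a family specified in advance, and validity depends on the chosen procedure's assumptions (for example, Benjamini-Hochberg is guaranteed for independent or positively dependent p-values)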

See also