Probability Distribution¶

Formula¶

\[ \text{Distribution of }X \text{ is the rule that gives } P(X \in A) \]

Parameters¶

\(X\): random variable
\(A\): event/set of values

What it means¶

A distribution specifies how probability mass or density is assigned across the possible values of a random variable.

What it's used for¶

Computing probabilities and expectations.
Choosing statistical models (e.g., Bernoulli, Normal, Poisson).

Key properties¶

Discrete distributions use a PMF.
Continuous distributions use a PDF (plus integration).
The CDF works for both discrete and continuous cases.

Common gotchas¶

A PDF value is not itself a probability.
The same family (e.g., Normal) can represent many distributions with different parameters.

Example¶

For a fair die roll \(X\in\{1,\dots,6\}\), the distribution is \(P(X=k)=1/6\) for each \(k\).

How to Compute (Pseudocode)¶

Input: random variable X or data/model assumptions
Output: probability distribution representation

choose a representation (PMF, PDF, CDF, parametric family, or empirical distribution)
estimate or specify the required parameters/rules
return the distribution object/representation

Complexity¶

Time: Depends on whether the distribution is specified analytically or estimated from data (empirical estimation is often linear in sample size)
Space: Depends on representation (parametric parameters vs histogram/empirical tables)
Assumptions: This is a modeling/representation workflow; downstream PMF/PDF/CDF computations determine concrete costs