Mean (Expected Value)¶
Parameters¶
- \(X\): random variable
- \(x_i\): samples
- \(n\): number of samples
What it means¶
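Average value of a distribution or dataset: for a random variable \(X\), the probability-weighted average \(\mathbb{E}[X]\); for \(n\) samples, the sample mean \(\bar x=\frac{1}{n}\sum_{i=1}^{n}x_i\).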
What it's used for¶
- Summarizing central tendency.
- Baseline predictor in regression (always predict the mean of the observed targets).
Key properties¶
- Linear: \(\mathbb{E}[aX+b]=a\mathbb{E}[X]+b\).
- Minimizes expected squared error: over all constants \(c\), \(\mathbb{E}[(X-c)^2]\) is smallest at \(c=\mathbb{E}[X]\).
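Both properties can be checked numerically on a small sample. The following is a minimal sketch, not a proof: the constants `a` and `b`, the helper `mean_squared_error`, and the candidate grid are illustrative choices that do not come from the card.

```python
from statistics import fmean

def mean_squared_error(samples, c):
    """Average squared distance between the samples and a constant guess c."""
    return fmean((x - c) ** 2 for x in samples)

samples = [1.0, 2.0, 4.0]
a, b = 3.0, -1.0  # arbitrary illustrative constants (assumption)

# Linearity: the mean of a*x + b should equal a * mean(x) + b.
lhs = fmean(a * x + b for x in samples)
rhs = a * fmean(samples) + b

# Squared-error minimization: on a coarse grid of candidate constants,
# the best candidate should be the grid point closest to the mean.
m = fmean(samples)
grid = [i / 10 for i in range(0, 51)]  # 0.0, 0.1, ..., 5.0
best_on_grid = min(grid, key=lambda c: mean_squared_error(samples, c))

print(lhs, rhs)
print(m, best_on_grid)
```

The grid search only illustrates the claim; the exact minimizer is the mean itself, which generally falls between grid points.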
Common gotchas¶
- Sensitive to outliers.
- For heavy-tailed distributions, the mean may not exist (for example, the mean of the Cauchy distribution is undefined).
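The outlier sensitivity is easy to see by adding one extreme value to a small sample. This is a minimal sketch: the value `1000.0` is a hypothetical outlier chosen for illustration, and the median is shown only as a point of comparison.

```python
from statistics import fmean, median

clean = [1.0, 2.0, 4.0]
contaminated = clean + [1000.0]  # hypothetical outlier (assumption)

mean_clean = fmean(clean)
mean_contaminated = fmean(contaminated)

# The median is included only to contrast how far each summary moves.
median_clean = median(clean)
median_contaminated = median(contaminated)

print(mean_clean, mean_contaminated)
print(median_clean, median_contaminated)
```

A single extreme value moves the mean by two orders of magnitude here, while the median shifts by one unit.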
Example¶
For samples \([1, 2, 4]\), \(\bar x=(1+2+4)/3=7/3\approx 2.333\).
How to Compute (Pseudocode)¶
Input: samples \(x_1, \dots, x_n\) with \(n \ge 1\)
Output: sample mean \(\bar x\)
set total to 0
for each sample x_i: add x_i to total
return total / n
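The pseudocode above can be sketched in Python as follows. The function name `sample_mean` and the choice to raise `ValueError` on empty input are assumptions made for this sketch; in practice the standard library's `statistics.fmean` covers the same case.

```python
def sample_mean(samples):
    """Return the arithmetic mean of a non-empty iterable of numbers."""
    total = 0.0
    count = 0
    for x in samples:
        total += x   # accumulate the sum in one pass
        count += 1   # count items so any iterable works, not just lists
    if count == 0:
        raise ValueError("mean of an empty sample is undefined")
    return total / count

x_bar = sample_mean([1, 2, 4])
print(round(x_bar, 3))
```

For long inputs with widely varying magnitudes, summing with `math.fsum` instead of a plain running total reduces floating-point rounding error.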
Complexity¶
- Time: \(O(n)\) for \(n\) samples, using a single pass that accumulates the sum
- Space: \(O(1)\) extra when samples are consumed one at a time; \(O(n)\) only if the samples themselves must be stored
- Assumptions: Sample-mean workflow shown; parameter-estimation and streaming/online algorithms can change constants and memory usage
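For the streaming/online case, the mean can be updated one sample at a time without storing earlier values, using \(m_k = m_{k-1} + (x_k - m_{k-1})/k\). The class below is a minimal sketch of that update; the name `RunningMean` and its interface are hypothetical, not part of the card.

```python
class RunningMean:
    """Incrementally maintained mean using O(1) memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, x):
        """Fold one new sample into the mean and return the updated value."""
        self.count += 1
        # Move the current mean toward x by 1/count of the gap.
        self.mean += (x - self.mean) / self.count
        return self.mean

rm = RunningMean()
history = [rm.update(x) for x in [1, 2, 4]]
print(history)
```

After the last update the running value agrees with the batch mean of the same samples, up to floating-point rounding.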