
Mean Squared Error (MSE)

Formula

\[ \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^n (y_i-\hat y_i)^2 \]

Plot

[Figure: squared error vs residual r — the parabola \(r^2\) plotted for \(r \in [-3, 3]\)]

Parameters

  • \(y_i\): true value
  • \(\hat y_i\): prediction
  • \(n\): number of samples

What it means

Average squared prediction error.

What it's used for

  • Regression model evaluation and training loss.
  • Penalizing large errors more than small ones.

Key properties

  • Nonnegative; 0 means perfect prediction.
  • Sensitive to outliers due to squaring.
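The outlier sensitivity above can be seen numerically; a minimal sketch (the helper name `mean_sq` and the sample residuals are illustrative, not from the original):

```python
# Illustrative sketch: a single large residual dominates the average of squares.
def mean_sq(errors):
    # MSE computed directly from residuals e_i = y_i - y_hat_i
    return sum(e ** 2 for e in errors) / len(errors)

print(mean_sq([0.1, 0.1, -0.1, 0.0]))   # small residuals -> tiny MSE (~0.0075)
print(mean_sq([0.1, 0.1, -0.1, 10.0]))  # one outlier of 10 -> MSE jumps (~25.0)
```

Changing one residual from 0 to 10 inflates the MSE by a factor of over a thousand here, because the outlier contributes \(10^2 = 100\) to the sum.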

Common gotchas

  • Units are squared; compare carefully across scales.
  • Not robust to heavy-tailed noise.
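A standard remedy for the squared-units gotcha (not mentioned in the original, but widely used) is to report the square root of MSE, i.e. RMSE, which is in the same units as \(y\):

```python
import math

# RMSE = sqrt(MSE) restores the original units of y.
mse_value = 2.5                 # hypothetical MSE, in squared units
rmse = math.sqrt(mse_value)     # back in the units of y
print(rmse)
```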

Example

If errors are \([-1, 2, 0]\), \(\mathrm{MSE}=((-1)^2+2^2+0^2)/3=(1+4+0)/3=5/3\approx 1.667\).
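The worked example can be checked in a couple of lines of Python:

```python
# Verify the worked example: errors [-1, 2, 0] give MSE = 5/3.
errors = [-1, 2, 0]
mse = sum(e ** 2 for e in errors) / len(errors)
print(mse)  # 1.666...
```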

How to Compute (Pseudocode)

Input: true values y[1..n], predictions y_hat[1..n]
Output: mse

sum_sq <- 0
for i from 1 to n:
  residual <- y[i] - y_hat[i]
  sum_sq <- sum_sq + residual^2

mse <- sum_sq / n
return mse
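The pseudocode above translates directly into Python; a sketch (the function name and the length check are additions for safety, not part of the pseudocode):

```python
def mse(y, y_hat):
    """Mean squared error of predictions y_hat against true values y."""
    if len(y) != len(y_hat):
        raise ValueError("y and y_hat must have the same length")
    sum_sq = 0.0
    for yi, yhi in zip(y, y_hat):
        residual = yi - yhi        # residual <- y[i] - y_hat[i]
        sum_sq += residual ** 2    # sum_sq <- sum_sq + residual^2
    return sum_sq / len(y)         # mse <- sum_sq / n
```

For example, `mse([3, 5], [2, 7])` averages the squared residuals \(1\) and \(4\) to give \(2.5\).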

Complexity

  • Time: \(O(n)\)
  • Space: \(O(1)\) additional space
  • Assumptions: \(n\) is the number of paired predictions/targets