Mean Absolute Error (MAE)

Formula

\[ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^n |y_i-\hat y_i| \]

Plot

[Figure: the absolute error \(|r|\) as a function of the residual \(r\), plotted for \(r \in [-3, 3]\).]

Parameters

  • \(y_i\): true value
  • \(\hat y_i\): prediction
  • \(n\): number of samples

What it means

The average magnitude of the prediction errors, ignoring their sign.

What it's used for

  • Robust regression error metric.
  • Comparing models when outliers should not dominate.

Key properties

  • Same units as the target.
  • Less sensitive to outliers than MSE/RMSE.

Common gotchas

  • The absolute value is not differentiable at 0, so gradient-based optimizers must use a subgradient there.
  • Can under-penalize large errors vs MSE.
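To illustrate the points above about outlier sensitivity, here is a minimal sketch (the data values are made up for illustration) comparing MAE with RMSE when one error is inflated:

```python
import math

def mae(y, y_hat):
    # Mean of absolute residuals
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    # Root of the mean of squared residuals
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

y = [0.0, 0.0, 0.0, 0.0]
y_hat_clean = [1.0, 1.0, 1.0, 1.0]     # every error is 1
y_hat_outlier = [1.0, 1.0, 1.0, 10.0]  # one error is 10

print(mae(y, y_hat_clean), mae(y, y_hat_outlier))    # 1.0  3.25
print(rmse(y, y_hat_clean), rmse(y, y_hat_outlier))  # 1.0  ~5.07
```

The outlier raises MAE linearly (from 1.0 to 3.25) but raises RMSE much more (to about 5.07), which is the sense in which MAE "under-penalizes" large errors relative to squared-error metrics.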

Example

If errors are \([-1, 2, 0]\), \(\mathrm{MAE}=(1+2+0)/3=1\).
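The worked example can be checked in a couple of lines of Python:

```python
errors = [-1, 2, 0]
# Mean of absolute errors: (1 + 2 + 0) / 3
mae = sum(abs(e) for e in errors) / len(errors)
print(mae)  # 1.0
```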

How to Compute (Pseudocode)

Input: true values y[1..n], predictions y_hat[1..n]
Output: MAE

sum_abs <- 0
for i from 1 to n:
  sum_abs <- sum_abs + abs(y[i] - y_hat[i])
return sum_abs / n
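The pseudocode above translates directly to Python; this sketch adds a length check (an assumption, not part of the original pseudocode) so mismatched inputs fail loudly:

```python
def mean_absolute_error(y, y_hat):
    """MAE of paired true values y and predictions y_hat."""
    if len(y) != len(y_hat):
        raise ValueError("y and y_hat must have the same length")
    # Accumulate absolute residuals, then average
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

# Residuals are 0.5, -0.5, 0.0, so MAE = 1.0 / 3
print(mean_absolute_error([3.0, -0.5, 2.0], [2.5, 0.0, 2.0]))
```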

Complexity

  • Time: \(O(n)\)
  • Space: \(O(1)\) extra space
  • Assumptions: \(n\) paired predictions/targets; prediction-generation cost is excluded