Hessian

Formula

\[ H_f(x)=\left[\frac{\partial^2 f}{\partial x_i \partial x_j}\right]_{i,j} \]

Parameters

  • \(f:\mathbb{R}^n\to\mathbb{R}\): scalar-valued function
  • \(H_f(x)\): matrix of second partial derivatives

What it means

The Hessian captures local curvature of a scalar function.

What it's used for

  • Newton-type optimization methods.
  • Determining curvature and classifying critical points.
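For instance, a single Newton step solves \(H s = \nabla f(p)\) and moves to \(p - s\). A minimal sketch in Python, using the quadratic from the Example section (the gradient is hand-coded; names are illustrative):

```python
import numpy as np

# One Newton step for f(x, y) = x^2 + 3y^2 (the Example's quadratic).

def grad(p):
    x, y = p
    return np.array([2.0 * x, 6.0 * y])

H = np.array([[2.0, 0.0],
              [0.0, 6.0]])  # constant Hessian of this quadratic

p = np.array([4.0, -1.0])           # starting point
step = np.linalg.solve(H, grad(p))  # solve H s = grad f(p); never invert H explicitly
p_next = p - step                   # Newton update
print(p_next)                       # → [0. 0.]
```

Because the function is quadratic, a single Newton step lands exactly at the minimizer.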

Key properties

  • Symmetric whenever the mixed partials are continuous (Schwarz's theorem).
  • A positive definite Hessian at a point implies the function is strictly convex in a neighborhood of that point.
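Classifying a critical point reduces to inspecting the Hessian's eigenvalues. A sketch, assuming the Hessian is symmetric so `eigvalsh` applies:

```python
import numpy as np

# Classify a critical point by the signs of the Hessian's eigenvalues.
H = np.array([[2.0, 0.0],
              [0.0, 6.0]])
eig = np.linalg.eigvalsh(H)  # eigenvalues of a symmetric matrix, ascending

if np.all(eig > 0):
    verdict = "local minimum (positive definite)"
elif np.all(eig < 0):
    verdict = "local maximum (negative definite)"
elif np.any(eig > 0) and np.any(eig < 0):
    verdict = "saddle point (indefinite)"
else:
    verdict = "inconclusive (singular Hessian)"

print(verdict)  # → local minimum (positive definite)
```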

Common gotchas

  • Computing full Hessians is expensive in high dimensions.
  • Indefinite Hessians can cause Newton steps to fail without damping.

Example

For \(f(x,y)=x^2+3y^2\), \(H_f=\begin{bmatrix}2&0\\0&6\end{bmatrix}\).

How to Compute (Pseudocode)

Input: scalar function f: R^n -> R, point x
Output: Hessian matrix H (n x n)

for i from 1 to n:
  for j from i to n:   # symmetry: mirror H[i,j] into H[j,i]
    H[i,j] <- second partial derivative d^2 f / (d x_i d x_j) evaluated at x
    H[j,i] <- H[i,j]

return H
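One possible realization of the pseudocode uses central finite differences for each second partial (a sketch, not the only way; autodiff would be exact):

```python
import numpy as np

def hessian_fd(f, x, h=1e-5):
    """Central-difference approximation of the Hessian of f at x.
    Requires O(n^2) function evaluations; h trades truncation vs. rounding error."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            # d^2 f / (dx_i dx_j) via a 4-point central difference
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return H

f = lambda p: p[0]**2 + 3.0 * p[1]**2        # the Example's function
H = hessian_fd(f, np.array([1.0, 2.0]))
print(H)                                      # ≈ [[2, 0], [0, 6]]
```

For this quadratic the central-difference formula is exact up to floating-point rounding, matching the Example's Hessian.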

Complexity

  • Time: \(O(n^2)\) second-derivative evaluations; symmetry roughly halves the count
  • Space: \(O(n^2)\) to store the Hessian matrix
  • Assumptions: Excludes the internal cost of each derivative evaluation; autodiff/Hessian-vector-product methods may avoid forming the full Hessian
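When only products \(Hv\) are needed (e.g. for conjugate-gradient-based Newton methods), the full matrix never has to be formed: \(Hv \approx (\nabla f(x + hv) - \nabla f(x - hv)) / 2h\). A sketch with a hand-coded gradient for the Example's function:

```python
import numpy as np

# Hessian-vector product without forming H, via finite differences of the gradient.
# grad is hand-coded here for f(x, y) = x^2 + 3y^2; in practice it would come
# from autodiff or an analytic derivation.

def grad(p):
    return np.array([2.0 * p[0], 6.0 * p[1]])

def hvp(grad, x, v, h=1e-6):
    """Approximate H v using two gradient evaluations: O(n) memory, no n x n matrix."""
    return (grad(x + h * v) - grad(x - h * v)) / (2.0 * h)

x = np.array([1.0, 2.0])
v = np.array([1.0, -1.0])
hv = hvp(grad, x, v)
print(hv)  # ≈ H v = [2, -6]
```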

See also