Hessian¶
Formula¶
\[
H_f(x)=\left[\frac{\partial^2 f}{\partial x_i \partial x_j}\right]_{i,j}
\]
Parameters¶
- \(f:\mathbb{R}^n\to\mathbb{R}\): scalar-valued function
- \(H_f(x)\): matrix of second partial derivatives
What it means¶
The Hessian captures local curvature of a scalar function.
What it's used for¶
- Newton-type optimization methods.
- Determining curvature and classifying critical points.
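Newton's method uses the Hessian to rescale the gradient step. A minimal sketch, using the quadratic \(f(x,y)=x^2+3y^2\) from the example below with hand-coded gradient and Hessian (the names `grad`, `hess`, and `newton_step` are illustrative, not from the source):

```python
import numpy as np

def newton_step(grad, hess, x):
    """One Newton step: x_new = x - H^{-1} grad(x).

    Solve the linear system rather than inverting H explicitly."""
    return x - np.linalg.solve(hess(x), grad(x))

# f(x, y) = x^2 + 3y^2 has gradient (2x, 6y) and constant Hessian diag(2, 6).
grad = lambda v: np.array([2.0 * v[0], 6.0 * v[1]])
hess = lambda v: np.array([[2.0, 0.0], [0.0, 6.0]])

x = np.array([5.0, -4.0])
x = newton_step(grad, hess, x)  # for a quadratic, one step reaches the minimizer
```

Because \(f\) is quadratic, a single Newton step lands exactly on the minimizer at the origin; for general functions the step is iterated.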
Key properties¶
- Symmetric whenever mixed partials are equal; by Schwarz's theorem this holds when the second partials are continuous.
- A positive definite Hessian at a point implies the function is strictly convex in a neighborhood of that point.
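The definiteness test above can be carried out numerically by inspecting the eigenvalues of the (symmetric) Hessian. A sketch, with an assumed tolerance for treating near-zero eigenvalues as zero:

```python
import numpy as np

def classify_hessian(H, tol=1e-10):
    """Classify a critical point by the eigenvalue signs of a symmetric Hessian."""
    eig = np.linalg.eigvalsh(H)  # eigvalsh exploits symmetry
    if np.all(eig > tol):
        return "positive definite (local minimum)"
    if np.all(eig < -tol):
        return "negative definite (local maximum)"
    if np.any(eig > tol) and np.any(eig < -tol):
        return "indefinite (saddle point)"
    return "semidefinite (second-derivative test inconclusive)"

classify_hessian([[2, 0], [0, 6]])   # positive definite: local minimum
classify_hessian([[2, 0], [0, -6]])  # indefinite: saddle point
```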
Common gotchas¶
- Computing full Hessians is expensive in high dimensions.
- Indefinite Hessians can cause Newton steps to fail without damping.
Example¶
For \(f(x,y)=x^2+3y^2\), \(H_f=\begin{bmatrix}2&0\\0&6\end{bmatrix}\).
How to Compute (Pseudocode)¶
Input: scalar function f: R^n -> R, point x
Output: Hessian matrix H (n x n)
for i from 1 to n:
    for j from 1 to n:
        H[i, j] <- second partial derivative d^2 f / (d x_i d x_j) evaluated at x
return H
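The pseudocode above can be sketched in Python by approximating each second partial with a central finite difference; the function name `hessian_fd` and step size `h` are assumptions of this sketch, not part of the source:

```python
import numpy as np

def hessian_fd(f, x, h=1e-4):
    """Approximate the Hessian of f: R^n -> R at x with central differences.

    Fills all n^2 entries, matching the double loop in the pseudocode."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n); e_i[i] = h
            e_j = np.zeros(n); e_j[j] = h
            # Central-difference estimate of d^2 f / (d x_i d x_j)
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * h * h)
    return H

f = lambda v: v[0] ** 2 + 3 * v[1] ** 2
hessian_fd(f, [1.0, 2.0])  # approximately [[2, 0], [0, 6]], matching the example
```

For well-behaved functions an analytic or autodiff Hessian is preferable; finite differences are shown here only because they mirror the entry-by-entry pseudocode directly.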
Complexity¶
- Time: \(O(n^2)\) second-derivative evaluations
- Space: \(O(n^2)\) to store the Hessian matrix
- Assumptions: Excludes the internal cost of each derivative evaluation; autodiff/Hessian-vector-product methods may avoid forming the full Hessian