Hessian¶
Formula¶
\[
H_f(x)=\left[\frac{\partial^2 f}{\partial x_i \partial x_j}\right]_{i,j}
\]
Parameters¶
- \(f:\mathbb{R}^n\to\mathbb{R}\): scalar-valued function
- \(H_f(x)\): matrix of second partial derivatives
What it means¶
The Hessian captures local curvature of a scalar function.
What it's used for¶
- Newton-type optimization methods.
- Determining curvature and classifying critical points.
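Newton's method uses the Hessian to rescale the gradient step. A minimal sketch, using the quadratic \(f(x,y)=x^2+3y^2\) from the example below with hand-coded gradient and Hessian (the names `grad`, `hess`, and `newton_step` are illustrative, not from the source):

```python
import numpy as np

def newton_step(grad, hess, x):
    """One Newton step: x_new = x - H^{-1} grad(x).

    Solve the linear system rather than inverting H explicitly."""
    return x - np.linalg.solve(hess(x), grad(x))

# f(x, y) = x^2 + 3y^2 has gradient (2x, 6y) and constant Hessian diag(2, 6).
grad = lambda v: np.array([2.0 * v[0], 6.0 * v[1]])
hess = lambda v: np.array([[2.0, 0.0], [0.0, 6.0]])

x = np.array([5.0, -4.0])
x = newton_step(grad, hess, x)  # for a quadratic, one step reaches the minimizer
```

Because \(f\) is quadratic, a single Newton step lands exactly on the minimizer at the origin; for general functions the step is iterated.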
Key properties¶
- Symmetric whenever mixed partials are equal; by Schwarz's theorem this holds when the second partials are continuous.
- A positive definite Hessian at a point implies the function is strictly convex in a neighborhood of that point.
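The definiteness test above can be carried out numerically by inspecting the eigenvalues of the (symmetric) Hessian. A sketch, with an assumed tolerance for treating near-zero eigenvalues as zero:

```python
import numpy as np

def classify_hessian(H, tol=1e-10):
    """Classify a critical point by the eigenvalue signs of a symmetric Hessian."""
    eig = np.linalg.eigvalsh(H)  # eigvalsh exploits symmetry
    if np.all(eig > tol):
        return "positive definite (local minimum)"
    if np.all(eig < -tol):
        return "negative definite (local maximum)"
    if np.any(eig > tol) and np.any(eig < -tol):
        return "indefinite (saddle point)"
    return "semidefinite (second-derivative test inconclusive)"

classify_hessian([[2, 0], [0, 6]])   # positive definite: local minimum
classify_hessian([[2, 0], [0, -6]])  # indefinite: saddle point
```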
Common gotchas¶
- Computing full Hessians is expensive in high dimensions.
- Indefinite Hessians can cause Newton steps to fail without damping.
Example¶
For \(f(x,y)=x^2+3y^2\), \(H_f=\begin{bmatrix}2&0\\0&6\end{bmatrix}\).
How to Compute (Pseudocode)¶
Input: scalar function f: R^n -> R, point x
Output: Hessian matrix H (n x n)
for i from 1 to n:
    for j from 1 to n:
        H[i, j] <- second partial derivative d^2 f / (d x_i d x_j) evaluated at x
return H
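The pseudocode above can be sketched in Python by approximating each second partial with a central finite difference; the function name `hessian_fd` and step size `h` are assumptions of this sketch, not part of the source:

```python
import numpy as np

def hessian_fd(f, x, h=1e-4):
    """Approximate the Hessian of f: R^n -> R at x with central differences.

    Fills all n^2 entries, matching the double loop in the pseudocode."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n); e_i[i] = h
            e_j = np.zeros(n); e_j[j] = h
            # Central-difference estimate of d^2 f / (d x_i d x_j)
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * h * h)
    return H

f = lambda v: v[0] ** 2 + 3 * v[1] ** 2
hessian_fd(f, [1.0, 2.0])  # approximately [[2, 0], [0, 6]], matching the example
```

For well-behaved functions an analytic or autodiff Hessian is preferable; finite differences are shown here only because they mirror the entry-by-entry pseudocode directly.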
Complexity¶
- Time: \(O(n^2)\) second-derivative evaluations
- Space: \(O(n^2)\) to store the Hessian matrix
- Assumptions: Excludes the internal cost of each derivative evaluation; autodiff/Hessian-vector-product methods may avoid forming the full Hessian