Differential Theory

This post summarizes the differential theory of convex functions, covering directional derivatives, subgradients, and the Legendre transformation.

Characterization of Smoothness

Directional Derivatives

The directional derivative of $f$ at $x$ in the direction $d$ is defined as $$\nabla_d f(x)=\lim\limits_{t\rightarrow 0^+}\frac{f(x+td)-f(x)}{t}.$$

If $f$ is finite-valued and convex, then for fixed $x$ the directional derivative is finite and sublinear as a function of $d$.
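
As a concrete instance, take $f(x)=|x|$ on $\mathbb{R}$. At $x=0$, $$\nabla_d f(0)=\lim\limits_{t\to 0^+}\frac{|td|}{t}=|d|,$$ which is indeed finite, positively homogeneous, and subadditive in $d$, even though $f$ is not differentiable at $0$.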

Subgradients

The subdifferential $\partial f (x)$ of $f$ at $x$ is the nonempty compact convex subset of $\mathbb{R}^n$ whose support function is the directional derivative, i.e. $$\partial f(x)=\{s\in\mathbb{R}^n\mid \langle s, d\rangle \leq \nabla_d f(x),\ \forall\ d\in\mathbb{R}^n\}.$$ Any vector $s\in \partial f(x)$ is called a subgradient of $f$ at $x$.

The subgradient can also be defined via the subgradient inequality: $s$ is a subgradient of $f$ at $x$ if $$f(y)\geq f(x)+ \langle s, y-x\rangle,\quad \forall\ y\in\mathbb{R}^n.$$
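
For example, for $f(x)=|x|$ at $x=0$, both definitions give $$\partial f(0)=\{s\in\mathbb{R}\mid sd\leq |d|,\ \forall\ d\in\mathbb{R}\}=\{s\mid |s|\leq 1\}=[-1,1],$$ and indeed the subgradient inequality $|y|\geq s(y-0)$ holds for all $y$ exactly when $|s|\leq 1$.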

Properties

  • (Scalar Multiplication) If $f$ is a proper convex function on $\mathbb{R}^n$, then $\partial(\lambda f)(x)=\lambda\,\partial f(x),\ \forall\ x\in\mathbb{R}^n,\ \lambda>0.$
  • (Additivity) If $f_1$ and $f_2$ are proper convex functions on $\mathbb{R}^n$, then $\partial(f_1+f_2)(x)\supseteq \partial f_1(x)+ \partial f_2(x),\ \forall\ x\in\mathbb{R}^n$. If moreover $\text{ri}(\text{dom}(f_1))\cap \text{ri}(\text{dom}(f_2))\neq\varnothing$, then $\partial(f_1+f_2)(x)= \partial f_1(x)+ \partial f_2(x),\ \forall\ x\in\mathbb{R}^n$.
  • (Affine Composition) Let $f(x)=h(Ax)$, where $h$ is a proper convex function on $\mathbb{R}^m$ and $A$ is a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$. Then $\partial f(x)\supseteq A^{\ast}\partial h(Ax),\ \forall\ x\in\mathbb{R}^n$. If $\text{Im}(A)\cap \text{ri}(\text{dom}(h))\neq\varnothing$, or if $h$ is polyhedral and $\text{Im}(A)\cap \text{dom}(h)\neq\varnothing$, then $\partial f(x)= A^{\ast}\partial h(Ax),\ \forall\ x\in\mathbb{R}^n$. A worked instance of these rules follows.
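
As a quick illustration of additivity, take $f(x)=|x|+\tfrac{1}{2}x^2$ on $\mathbb{R}$. Both summands have full domain, so the relative-interior condition holds and $$\partial f(0)=\partial |\cdot|(0)+\{0\}=[-1,1].$$ Similarly, for the affine composition $f(x)=|\langle a, x\rangle|$ with $h(t)=|t|$ and $Ax=\langle a, x\rangle$, the rule gives $\partial f(0)=\{ta\mid t\in[-1,1]\}$.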

Essential Smoothness and Essential Strict Convexity

A proper convex function $f$ on $\mathbb{R}^n$ is essentially smooth if it satisfies the following conditions for $C=\text{int}(\text{dom}(f))$:

  • $C\neq\varnothing$;
  • $f$ is differentiable on $C$;
  • $\lim\limits_{\substack{x\rightarrow x_0\\ x\in C}}|\nabla f(x)|=+\infty$ for all $x_0\in\text{bd}(C)$, which is equivalent to $\partial f(x_0)=\varnothing$ for all $x_0\in\text{bd}(C)$.

The subtle difference between essential smoothness and smoothness lies in the boundary behavior of the gradient. A smooth convex function on $\mathbb{R}^n$ is in particular essentially smooth.
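
A standard example with nontrivial boundary behavior is $f(x)=-\log x$ for $x>0$ (and $f(x)=+\infty$ otherwise): here $C=(0,+\infty)$, $f$ is differentiable on $C$, and $|\nabla f(x)|=1/x\rightarrow+\infty$ as $x\rightarrow 0^+$, so $f$ is essentially smooth. In contrast, $f(x)=|x|$ is not essentially smooth, since it fails to be differentiable at the interior point $0$.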

(Essential Smoothness and Subgradients) If $f$ is a closed proper convex function, then $f$ is essentially smooth if and only if $$\partial f(x)=\begin{cases}\{\nabla f(x)\},&\quad x\in\text{int}(\text{dom}(f)),\\\varnothing,&\quad x\not\in\text{int}(\text{dom}(f)).\end{cases}$$

Dually, a proper convex function $f$ on $\mathbb{R}^n$ is essentially strictly convex if $f$ is strictly convex on every convex subset of $\text{dom}(\partial f)=\{x\mid \partial f(x)\neq \varnothing\}$.

Smoothness and Strict Convexity

The duality between essential smoothness and essential strict convexity can be characterized as follows.

(Essential Smoothness and Essential Strict Convexity) A closed proper convex function $f$ is essentially strictly convex if and only if its convex conjugate $f^{\ast}$ is essentially smooth.
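
For instance, the negative entropy $f(x)=x\log x$ for $x\geq 0$ (with $f(0)=0$ and $f(x)=+\infty$ for $x<0$) is essentially strictly convex, and its conjugate $f^{\ast}(u)=e^{u-1}$ is finite and differentiable on all of $\mathbb{R}$, hence essentially smooth.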

(Subgradients) A closed proper convex function $f$ has a one-to-one subgradient mapping if and only if both $f$ and $f^{\ast}$ are essentially smooth, which holds if and only if $f$ is essentially smooth and strictly convex on $\text{int}(\text{dom}(f))$.

The Legendre Transformation

Note that from the definition of the convex conjugate, $$f^{\ast}(u) = \sup\limits_{x\in\mathbb{R}^n}\ \langle x, u\rangle - f(x).$$ If $f$ is differentiable on its domain, differentiating with respect to $x$ and setting the derivative to zero gives $$u-\nabla f(x)=0.$$ If $\nabla f$ is a one-to-one mapping, then $x=[\nabla f]^{-1}(u)$ attains the supremum. Thus, $$f^{\ast}(u) = \langle [\nabla f]^{-1}(u), u \rangle- f([\nabla f]^{-1}(u)),$$ which is called the Legendre conjugate of $f$.
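
As a worked example, take the quadratic $f(x)=\tfrac{1}{2}\langle Qx, x\rangle$ with $Q$ positive definite. Then $\nabla f(x)=Qx$ is one-to-one with $[\nabla f]^{-1}(u)=Q^{-1}u$, and the formula above yields $$f^{\ast}(u)=\langle Q^{-1}u, u\rangle - \tfrac{1}{2}\langle QQ^{-1}u, Q^{-1}u\rangle = \tfrac{1}{2}\langle Q^{-1}u, u\rangle.$$

The short sketch below checks this identity numerically; the quadratic example, the matrix `Q`, and the function names are all illustrative choices, not anything prescribed by the theory.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
Q = A @ A.T + 3 * np.eye(3)      # positive definite by construction

def f(x):
    # f(x) = (1/2) <Qx, x>
    return 0.5 * x @ Q @ x

def f_star_closed_form(u):
    # Closed form of the conjugate: f*(u) = (1/2) <Q^{-1}u, u>
    return 0.5 * u @ np.linalg.solve(Q, u)

def f_star_legendre(u):
    # Legendre formula: f*(u) = <x, u> - f(x) at x = (grad f)^{-1}(u),
    # i.e. x solves Q x = u
    x = np.linalg.solve(Q, u)
    return x @ u - f(x)

u = rng.standard_normal(3)
print(f_star_closed_form(u), f_star_legendre(u))  # the two values agree
```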