Week 38: Logistic Regression and Optimization

Steepest descent

The basic idea of gradient descent is that a function $F(\mathbf{x})$, $\mathbf{x} \equiv (x_1,\cdots,x_n)$, decreases fastest if one goes from $\mathbf{x}$ in the direction of the negative gradient $-\nabla F(\mathbf{x})$.

It can be shown that if

$\mathbf{x}_{k+1} = \mathbf{x}_k - \gamma_k \nabla F(\mathbf{x}_k),$

with $\gamma_k > 0$ small enough, then $F(\mathbf{x}_{k+1}) \leq F(\mathbf{x}_k)$. This means that for a sufficiently small $\gamma_k$ each step moves towards smaller function values, i.e., towards a (local) minimum.
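The update rule can be illustrated with a minimal sketch. The test function `F`, its gradient `grad_F`, the helper `steepest_descent`, the starting point and the fixed step size `gamma = 0.1` are all assumptions chosen for illustration; they are not part of the notes. The function is a simple convex quadratic with its minimum at $(1, -2)$, so the result is easy to check.

```python
import numpy as np

def F(x):
    # Hypothetical test function: convex quadratic with minimum at (1, -2)
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 2.0) ** 2

def grad_F(x):
    # Analytical gradient of the test function above
    return np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 2.0)])

def steepest_descent(x0, gamma=0.1, n_iter=50):
    # Iterate x_{k+1} = x_k - gamma * grad F(x_k) with a constant step size
    x = np.asarray(x0, dtype=float)
    values = [F(x)]  # store F(x_k) to inspect the decrease
    for _ in range(n_iter):
        x = x - gamma * grad_F(x)
        values.append(F(x))
    return x, values

x_min, values = steepest_descent([5.0, 5.0])
print("approximate minimizer:", x_min)
print("F at start and end:", values[0], values[-1])
```

For this particular quadratic, a step size of $0.1$ is small enough that the stored values of $F$ never increase from one iteration to the next. A step size that is too large (for example $0.6$ here) makes the iterates overshoot along the second coordinate and diverge, which is why the condition on $\gamma_k$ matters.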