Steepest descent

The basic idea of gradient descent is that a function \( F(\mathbf{x}) \), \( \mathbf{x} \equiv (x_1,\cdots,x_n) \), decreases fastest if one goes from \( \mathbf{x} \) in the direction of the negative gradient \( -\nabla F(\mathbf{x}) \).
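
As a simple illustration (the quadratic test function \( F(\mathbf{x}) = x_1^2 + x_2^2 \) is chosen here only as an example), the gradient is

$$ \nabla F(\mathbf{x}) = (2x_1, 2x_2), $$

so the steepest-descent direction \( -\nabla F(\mathbf{x}) \) points from any \( \mathbf{x} \) straight towards the minimum at the origin.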

It can be shown that if we update the iterates according to

$$ \mathbf{x}_{k+1} = \mathbf{x}_k - \gamma_k \nabla F(\mathbf{x}_k), $$

with a step size \( \gamma_k > 0 \) that is small enough, then \( F(\mathbf{x}_{k+1}) \leq F(\mathbf{x}_k) \). This means that for a sufficiently small \( \gamma_k \) we are always moving towards smaller function values, i.e., towards a minimum.
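
A minimal sketch of the iteration above, assuming the same quadratic test function and a fixed step size \( \gamma \) (both chosen only for illustration, not prescribed by the method itself):

```python
import numpy as np

def gradient_descent(grad_F, x0, gamma=0.1, max_iter=1000, tol=1e-8):
    """Steepest descent: x_{k+1} = x_k - gamma * grad_F(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = gamma * grad_F(x)
        x = x - step
        # Stop when the update is negligible
        if np.linalg.norm(step) < tol:
            break
    return x

# Example: F(x) = x_1^2 + x_2^2, so grad F(x) = (2 x_1, 2 x_2)
grad_F = lambda x: 2.0 * x
print(gradient_descent(grad_F, x0=[1.0, 2.0]))  # converges towards (0, 0)
```

With the quadratic example the iterates shrink geometrically towards the origin; for a general \( F \) the behaviour depends on the choice of \( \gamma_k \), which may have to be adapted at each step.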