The basic idea of gradient descent is that a function \( F(\mathbf{x}) \), \( \mathbf{x} \equiv (x_1,\cdots,x_n) \), decreases fastest if one goes from \( \bf {x} \) in the direction of the negative gradient \( -\nabla F(\mathbf{x}) \).
It can be shown that if
$$ \mathbf{x}_{k+1} = \mathbf{x}_k - \gamma_k \nabla F(\mathbf{x}_k), $$with \( \gamma_k > 0 \).
For \( \gamma_k \) small enough, then \( F(\mathbf{x}_{k+1}) \leq F(\mathbf{x}_k) \). This means that for a sufficiently small \( \gamma_k \) we are always moving towards smaller function values, i.e a minimum.