The basic idea of gradient descent is that a function $F(x)$, $x \equiv (x_1, \cdots, x_n)$, decreases fastest if one goes from $x$ in the direction of the negative gradient $-\nabla F(x)$.
It can be shown that if

$$
x_{k+1} = x_k - \gamma_k \nabla F(x_k), \quad \gamma_k > 0,
$$

with $\gamma_k$ small enough, then $F(x_{k+1}) \le F(x_k)$. This means that for a sufficiently small $\gamma_k$ each step moves to a function value no larger than the previous one, i.e., towards a (local) minimum.
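As a rough illustration of the update rule, the sketch below applies it to a simple convex quadratic and checks that the function values never increase. The test function, the constant step size `gamma`, the starting point, and the helper names `F`, `grad_F` and `gradient_descent` are all assumptions chosen for this example; they are not part of the text above.

```python
import numpy as np

def F(x):
    """Illustrative test function F(x) = x1^2 + 10*x2^2 (an assumed example)."""
    return x[0]**2 + 10.0 * x[1]**2

def grad_F(x):
    """Analytical gradient of the test function F."""
    return np.array([2.0 * x[0], 20.0 * x[1]])

def gradient_descent(x0, gamma, n_steps):
    """Iterate x_{k+1} = x_k - gamma * grad F(x_k) with a constant step gamma.

    Returns the final point and the list of function values F(x_0), ..., F(x_n).
    """
    x = np.array(x0, dtype=float)
    values = [F(x)]
    for _ in range(n_steps):
        x = x - gamma * grad_F(x)  # step along the negative gradient
        values.append(F(x))
    return x, values

# Small step size: the function values should never increase.
x_min, values = gradient_descent([1.0, 1.0], gamma=0.05, n_steps=50)
monotone = all(b <= a for a, b in zip(values, values[1:]))
print("non-increasing:", monotone)
print("F(x_0) =", values[0], " F(x_50) =", values[-1])

# Step size too large for this function: the first step already increases F.
_, values_large = gradient_descent([1.0, 1.0], gamma=0.11, n_steps=5)
print("large step, F increases:", values_large[1] > values_large[0])
```

For this particular quadratic the iteration decreases $F$ for any constant step below 0.1; the run with `gamma=0.11` overshoots along the steep $x_2$ direction, which shows why the condition "small enough" matters.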