Overview video on Stochastic Gradient Descent

What is Stochastic Gradient Descent?

There are several reasons for using stochastic gradient descent. Some of these are:

  1. Efficiency: Updates weights more frequently, using a single sample or a small batch at a time, which can speed up convergence.
  2. Escaping local minima: The noise in stochastic updates can help the optimizer escape shallow local minima.
  3. Memory usage: Requires far less memory than computing the gradient over the entire dataset at once.
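The points above can be illustrated with a minimal sketch of mini-batch SGD for linear regression (the function name, hyperparameters, and data are illustrative assumptions, not from the text): each update uses only a small batch, so the full dataset never needs to fit in a single gradient computation.

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.05, epochs=100, batch_size=8, seed=0):
    """Hypothetical helper: fit weights w minimizing mean squared error via SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # Shuffle once per epoch, then step through small batches.
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            # Gradient of the MSE computed on this mini-batch only.
            grad = 2.0 / len(batch) * Xb.T @ (Xb @ w - yb)
            w -= lr * grad
    return w

# Toy example: recover the true weights [2, -3] from noiseless data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X @ np.array([2.0, -3.0])
w = sgd_linear_regression(X, y)
print(w)
```

Note that each update touches only `batch_size` rows of `X`, which is the memory point in the list above; the per-batch noise in `grad` is what gives SGD its chance of stepping out of shallow local minima.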