Week 40: Gradient descent methods (continued) and start Neural networks

Overview video on Stochastic Gradient Descent

There are several reasons for using stochastic gradient descent. Some of these are:

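Efficiency: The weights are updated after every single sample or small mini-batch rather than once per pass over the full dataset, which often reduces the total computation needed to reach a good solution.
Escaping local minima: The noise in the stochastic gradient estimates can help the iterates move out of shallow local minima and saddle points, although this is not guaranteed.
Memory usage: Only the current sample or mini-batch has to be held in memory when computing a gradient, rather than the entire dataset.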
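
The points above can be illustrated with a minimal sketch of mini-batch stochastic gradient descent for ordinary least squares. The synthetic data, the function name `minibatch_sgd`, and the values chosen for the batch size, number of epochs and learning rate `eta` are all assumptions made for illustration only. Each update uses just `batch_size` rows of the design matrix, and the data are reshuffled at the start of every epoch.

```python
import numpy as np


def minibatch_sgd(X, y, batch_size=5, n_epochs=50, eta=0.1, seed=0):
    """Mini-batch SGD for the mean squared error cost of a linear model.

    All default hyperparameters are illustrative assumptions, not tuned values.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    theta = np.zeros(p)
    for epoch in range(n_epochs):
        # Reshuffle once per epoch so every sample is visited exactly once.
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Gradient of the mean squared error over the mini-batch only.
            grad = (2.0 / len(idx)) * Xb.T @ (Xb @ theta - yb)
            theta -= eta * grad
    return theta


# Synthetic data (an assumption for this sketch): y = 4 + 3x + small noise.
rng = np.random.default_rng(2024)
n = 100
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
y = 4.0 + 3.0 * x + 0.1 * rng.standard_normal(n)

theta_sgd = minibatch_sgd(X, y)
# Reference solution from a direct least-squares solve on the full dataset.
theta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("SGD estimate:", theta_sgd)
print("OLS estimate:", theta_ols)
```

With a constant learning rate the SGD estimate does not settle exactly on the least-squares solution but fluctuates around it. This is the same gradient noise that can help the iterates leave shallow minima in non-convex problems.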