Plans for week 37, lecture Monday

The family of gradient descent methods

  1. Plain gradient descent with a constant learning rate, a reminder from last week with examples using OLS and Ridge regression (see the first sketch after this list)
  2. Improving gradient descent with momentum (second sketch below)
  3. Introducing stochastic gradient descent (third sketch below)
  4. More advanced, adaptive updates of the learning rate: AdaGrad, RMSprop, and Adam (fourth sketch below)
  5. Video of Lecture
  6. Whiteboard notes
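
For item 1, a minimal sketch of plain gradient descent on the Ridge cost function, where setting the penalty to zero recovers OLS. The synthetic linear dataset, the learning rate eta = 0.1, and the penalty lam = 0.01 are illustrative assumptions, not values from the lecture.

```python
import numpy as np

# Illustrative synthetic data: y = 4 + 3x + noise
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 1, (n, 1))
y = 4 + 3 * x + rng.normal(0, 0.1, (n, 1))
X = np.hstack([np.ones((n, 1)), x])   # design matrix with intercept column

lam = 0.01      # Ridge penalty; lam = 0 gives plain OLS
eta = 0.1       # constant learning rate (assumed value)
theta = rng.normal(0, 1, (2, 1))

for _ in range(1000):
    # gradient of (1/n)||y - X theta||^2 + lam ||theta||^2
    grad = (2.0 / n) * X.T @ (X @ theta - y) + 2.0 * lam * theta
    theta -= eta * grad

print(theta)   # should approach the intercept 4 and slope 3
```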
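For item 2, a sketch of gradient descent with momentum on the same OLS problem. Momentum accumulates an exponentially decaying average of past gradients, which damps oscillations and speeds up progress along consistent directions. The momentum coefficient gamma = 0.9 is a common but assumed choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 1, (n, 1))
y = 4 + 3 * x + rng.normal(0, 0.1, (n, 1))
X = np.hstack([np.ones((n, 1)), x])

eta = 0.1       # learning rate (assumed value)
gamma = 0.9     # momentum coefficient (assumed value)
theta = rng.normal(0, 1, (2, 1))
v = np.zeros_like(theta)   # velocity: running average of past gradients

for _ in range(1000):
    grad = (2.0 / n) * X.T @ (X @ theta - y)
    v = gamma * v + eta * grad   # accumulate momentum
    theta -= v

print(theta)
```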
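For item 3, a sketch of stochastic gradient descent with minibatches: each update uses the gradient of a random subset of the data rather than the full dataset, and the data are reshuffled every epoch. The minibatch size M = 10 and the number of epochs are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, M, n_epochs = 100, 10, 50
x = rng.uniform(0, 1, (n, 1))
y = 4 + 3 * x + rng.normal(0, 0.1, (n, 1))
X = np.hstack([np.ones((n, 1)), x])

eta = 0.1
theta = rng.normal(0, 1, (2, 1))

for epoch in range(n_epochs):
    perm = rng.permutation(n)          # reshuffle the data each epoch
    for i in range(0, n, M):
        idx = perm[i:i + M]            # one minibatch of size M
        Xb, yb = X[idx], y[idx]
        grad = (2.0 / M) * Xb.T @ (Xb @ theta - yb)
        theta -= eta * grad

print(theta)
```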
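For item 4, a sketch of the Adam update, which combines a momentum-like first-moment average with an RMSprop-like second-moment average and adds bias correction; the comments note where AdaGrad and RMSprop differ. The hyperparameter values are the commonly quoted defaults, assumed here rather than taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 1, (n, 1))
y = 4 + 3 * x + rng.normal(0, 0.1, (n, 1))
X = np.hstack([np.ones((n, 1)), x])

eta, beta1, beta2, eps = 0.01, 0.9, 0.999, 1e-8   # common default hyperparameters
theta = rng.normal(0, 1, (2, 1))
m = np.zeros_like(theta)   # first moment: running mean of gradients
s = np.zeros_like(theta)   # second moment: running mean of squared gradients

for t in range(1, 1001):
    grad = (2.0 / n) * X.T @ (X @ theta - y)
    m = beta1 * m + (1 - beta1) * grad       # momentum-like average
    s = beta2 * s + (1 - beta2) * grad**2    # RMSprop-like average; AdaGrad would instead keep a running sum of grad**2
    m_hat = m / (1 - beta1**t)               # bias correction for the zero initialization
    s_hat = s / (1 - beta2**t)
    theta -= eta * m_hat / (np.sqrt(s_hat) + eps)   # per-parameter adaptive step

print(theta)
```

Dividing by the square root of the second moment gives each parameter its own effective learning rate, which is what distinguishes this family of methods from plain or momentum gradient descent.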