Gradient Methods
Contents
Overview of week March 3-7
Teaching Material, videos and written material
Brief reminder of the Newton-Raphson method
The equations
Small values
Simple geometric interpretation
Extending to more than one variable
Jacobian
Inverse of the Jacobian
Steepest descent
More on Steepest descent
The ideal
The sensitivity of gradient descent
Convex function
Conditions on convex functions
Second condition
More on convex functions
Some simple problems
Broyden’s Algorithm for Solving Nonlinear Equations
Problem Formulation
Just a short reminder of Newton’s Method
Broyden’s Method
Broyden’s Good Method
Broyden’s Bad Method
Algorithm Steps
Advantages and Limitations
Applications
Broyden–Fletcher–Goldfarb–Shanno algorithm
BFGS optimization problem
BFGS optimization problem, setting up the equations
BFGS Algorithm Overview
Final steps
Convergence and Termination Criteria
Properties of BFGS
Final words on the BFGS
Standard steepest descent
Gradient method
Steepest descent method
Steepest descent method
Final expressions
Conjugate gradient method
Conjugate gradient method
Conjugate gradient method
Conjugate gradient method
Conjugate gradient method and iterations
Conjugate gradient method
Conjugate gradient method
Conjugate gradient method
Using gradient descent methods, limitations
Improving gradient descent with momentum
Same code but now with momentum gradient descent
Overview video on Stochastic Gradient Descent
Batches and mini-batches
Stochastic Gradient Descent (SGD)
Stochastic Gradient Descent
Computation of gradients
SGD example
The gradient step
Simple example code
When do we stop?
Slightly different approach
Time decay rate
Code with a varying number of minibatches
Replace or not
Momentum based GD
More on momentum based approaches
Momentum parameter
Second moment of the gradient
RMS prop
"ADAM optimizer":"https://arxiv.org/abs/1412.6980"
Algorithms and codes for Adagrad, RMSprop and Adam
Practical tips
Automatic differentiation
Using autograd
Autograd with more complicated functions
More complicated functions using the elements of their arguments directly
Functions using mathematical functions from Numpy
More autograd
And with loops
Using recursion
Unsupported functions
The syntax a.dot(b) when finding the dot product
Recommended to avoid
Using Autograd with OLS
Same code but now with momentum gradient descent
But none of these can compete with Newton's method
Including Stochastic Gradient Descent with Autograd
Same code but now with momentum gradient descent
Similar problem (now a second-order function) but with AdaGrad
RMSprop for adaptive learning rate with Stochastic Gradient Descent
And finally "ADAM":"https://arxiv.org/pdf/1412.6980.pdf"
And Logistic Regression
Introducing "JAX":"https://jax.readthedocs.io/en/latest/"
Overview of week March 3-7
Semi-Newton methods (Broyden's algorithm and the Broyden-Fletcher-Goldfarb-Shanno algorithm)
Steepest descent and conjugate gradient descent
Stochastic gradient descent and variants thereof
Automatic differentiation
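To fix ideas before going through the material above, here is a minimal sketch of plain gradient descent applied to an ordinary least squares problem. The synthetic data, the learning rate eta and the number of iterations are arbitrary choices made for this illustration only; the later sections develop the methods properly.

!bc pycod
# Minimal gradient descent for ordinary least squares (illustration only).
import numpy as np

np.random.seed(2025)
n = 100
x = np.random.rand(n, 1)
y = 4 + 3 * x + np.random.randn(n, 1)   # noisy linear data

X = np.c_[np.ones((n, 1)), x]           # design matrix with intercept column
theta = np.random.randn(2, 1)           # random initial parameters

eta = 0.1            # learning rate (step size), chosen by hand here
n_iterations = 1000

for _ in range(n_iterations):
    # Gradient of the mean squared error cost: (2/n) X^T (X theta - y)
    gradient = (2.0 / n) * X.T @ (X @ theta - y)
    theta -= eta * gradient

print("Gradient descent estimate:", theta.ravel())
print("Analytical OLS estimate:  ", (np.linalg.pinv(X.T @ X) @ X.T @ y).ravel())
!ec

For this simple convex cost function the iterates approach the analytical OLS solution; the stochastic, momentum-based and adaptive variants discussed later modify how the gradient is estimated and how the step size is chosen.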