Week 40: Gradient descent methods (continued) and start Neural networks
Contents
Plans for week 40
Summary from last week, using gradient descent methods, limitations
Overview video on Stochastic Gradient Descent
Batches and mini-batches
Stochastic Gradient Descent (SGD)
Stochastic Gradient Descent
Computation of gradients
SGD example
The gradient step
Simple example code
When do we stop?
Slightly different approach
Time decay rate
Code with a varying number of minibatches
Replace or not
Momentum based GD
More on momentum based approaches
Momentum parameter
Second moment of the gradient
RMS prop
"ADAM optimizer":"https://arxiv.org/abs/1412.6980"
Algorithms and codes for Adagrad, RMSprop and Adam
Practical tips
Automatic differentiation
Using autograd
Autograd with more complicated functions
More complicated functions using the elements of their arguments directly
Functions using mathematical functions from Numpy
More autograd
And with loops
Using recursion
Unsupported functions
The syntax a.dot(b) when finding the dot product
Recommended to avoid
Using Autograd with OLS
Same code but now with momentum gradient descent
But none of these can compete with Newton's method
Including Stochastic Gradient Descent with Autograd
Same code but now with momentum gradient descent
Similar problem (now with a second-order function) but with AdaGrad
RMSprop for adaptive learning rate with Stochastic Gradient Descent
And finally "ADAM":"https://arxiv.org/pdf/1412.6980.pdf"
And Logistic Regression
Introducing "JAX":"https://jax.readthedocs.io/en/latest/"
Introduction to Neural networks
Artificial neurons
Neural network types
Feed-forward neural networks
Convolutional Neural Network
Recurrent neural networks
Other types of networks
Multilayer perceptrons
Why multilayer perceptrons?
Illustration of a single perceptron model and a multi-perceptron model
Examples of XOR, OR and AND gates
Does Logistic Regression do a better Job?
Adding Neural Networks
Mathematical model
Mathematical model
Mathematical model
Mathematical model
Mathematical model
Matrix-vector notation
Matrix-vector notation and activation
Activation functions
Activation functions, Logistic and Hyperbolic ones
Relevance
Plans for week 40
Work on project 1 and discussions on how to structure your report
No weekly exercises for week 40, project work only
Video on how to write scientific reports recorded during one of the lab sessions
A general guideline can be found at https://github.com/CompPhysics/MachineLearning/blob/master/doc/Projects/EvaluationGrading/EvaluationForm.md.
Stochastic Gradient descent with examples and automatic differentiation
Neural Networks, setting up the basic steps, from the simple perceptron model to the multi-layer perceptron model.
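As a warm-up for the first lecture topic, the minibatch SGD loop for an OLS cost function can be sketched as below. This is a minimal sketch with a hand-derived gradient; in the lecture the gradient is instead produced by automatic differentiation (autograd or JAX). The data, learning rate and epoch count are illustrative choices, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(2023)

# Synthetic data for a quadratic model y = 2 + 3x + 4x^2 + noise
n = 100
x = rng.uniform(0, 1, size=(n, 1))
y = 2.0 + 3.0 * x + 4.0 * x**2 + 0.01 * rng.standard_normal((n, 1))

X = np.hstack([np.ones_like(x), x, x**2])   # design matrix

def gradient(X_b, y_b, theta):
    """Gradient of the MSE cost for OLS on one minibatch."""
    m = X_b.shape[0]
    return (2.0 / m) * X_b.T @ (X_b @ theta - y_b)

theta = rng.standard_normal((3, 1))
eta = 0.1            # fixed learning rate (illustrative)
n_epochs = 1000
batch_size = 10
n_batches = n // batch_size

for epoch in range(n_epochs):
    perm = rng.permutation(n)    # reshuffle each epoch, draw without replacement
    for b in range(n_batches):
        idx = perm[b * batch_size:(b + 1) * batch_size]
        theta -= eta * gradient(X[idx], y[idx], theta)

print(theta.ravel())   # should end up close to the true coefficients [2, 3, 4]
```

Swapping the hand-written `gradient` for `autograd.grad` (or `jax.grad`) of the cost function gives the automatic-differentiation version discussed in the notes.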
Video of lecture
"Whiteboard notes at
https://github.com/CompPhysics/MachineLearning/blob/master/doc/HandWrittenNotes/2023/NotesOct5.pdf
Readings and Videos:
These lecture notes
For a good discussion of gradient methods, we recommend Goodfellow et al., sections 4.3-4.5 and 8.3-8.6. We will come back to the latter chapter in our discussion of neural networks as well.
Aurélien Géron's chapter 4 on stochastic gradient descent
For neural networks we recommend Goodfellow et al chapter 6.
Video on gradient descent
Video on stochastic gradient descent
Neural Networks demystified
Building Neural Networks from scratch
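To preview the path from the single perceptron to the multilayer perceptron (and the XOR gate example in the contents above), here is a minimal feed-forward pass in the matrix-vector notation of the notes. The weights are hand-picked for illustration only, not learned; with these hypothetical values the two-layer network reproduces the XOR gate, which a single perceptron (or logistic regression) cannot represent.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + np.exp(-z))

# Hand-picked (not learned) weights and biases for a 2-2-1 network.
# Hidden unit 1 acts as an OR gate, hidden unit 2 as a NAND gate;
# the output unit ANDs them, which yields XOR.
W1 = np.array([[20.0, 20.0],
               [-20.0, -20.0]])
b1 = np.array([-10.0, 30.0])
W2 = np.array([[20.0, 20.0]])
b2 = np.array([-30.0])

def feed_forward(x):
    a1 = sigmoid(W1 @ x + b1)    # hidden-layer activations: a^1 = f(W^1 x + b^1)
    a2 = sigmoid(W2 @ a1 + b2)   # output: a^2 = f(W^2 a^1 + b^2)
    return a2

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, feed_forward(np.array(x, dtype=float)).round(3))
```

The printed outputs are close to 0 for inputs (0,0) and (1,1) and close to 1 for (0,1) and (1,0), i.e. the XOR truth table.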