Week 41: Neural networks and constructing a neural network code
Contents
Plan for week 41, October 7-11
Material for the lecture on Monday October 7, 2024
Material for the active learning sessions on Tuesday and Wednesday
Lecture Monday October 7
Introduction to Neural networks
Artificial neurons
Neural network types
Feed-forward neural networks
Convolutional Neural Network
Recurrent neural networks
Other types of networks
Multilayer perceptrons
Why multilayer perceptrons?
Illustration of a single perceptron model and a multi-perceptron model
Examples of XOR, OR and AND gates
Does Logistic Regression do a better job?
Adding Neural Networks
Mathematics of deep learning
Reminder on books with hands-on material and codes
Reading recommendations
Mathematics of deep learning and neural networks
Basics of an NN
Overarching view of a neural network
The optimization problem
Parameters of neural networks
Other ingredients of a neural network
Other parameters
Universal approximation theorem
Some parallels from real analysis
The approximation theorem in words
More on the general approximation theorem
Class of functions we can approximate
Setting up the equations for a neural network
Layout of a neural network with three hidden layers
Definitions
Inputs to the activation function
Derivatives and the chain rule
Derivative of the cost function
Simpler examples first, and automatic differentiation
Reminder on the chain rule and gradients
Multivariable functions
Automatic differentiation through examples
Simple example
Smarter way of evaluating the above function
Reducing the number of operations
Chain rule, forward and reverse modes
Forward and reverse modes
More complicated function
Counting the number of floating point operations
Defining intermediate operations
New expression for the derivative
Final derivatives
In general not this simple
Automatic differentiation
Chain rule
First network example, simple perceptron with one input
Layout of a simple neural network with no hidden layer
Optimizing the parameters
Adding a hidden layer
Layout of a simple neural network with one hidden layer
The derivatives
Important observations
The training
Code example
Exercise 1: Including more data
Simple neural network and the back propagation equations
Layout of a simple neural network with two input nodes, one hidden layer and one output node
The output layer
Compact expressions
Output layer
Explicit derivatives
Derivatives of the hidden layer
Final expression
Completing the list
Final expressions for the biases of the hidden layer
Gradient expressions
Exercise 2: Extended program
Getting serious, the back propagation equations for a neural network
Analyzing the last results
More considerations
Derivatives in terms of $z_j^L$
Bringing it together
Final back propagating equation
Using the chain rule and summing over all $k$ entries
Setting up the back propagation algorithm
Setting up the back propagation algorithm, part 2
Setting up the back propagation algorithm, part 3
Updating the gradients
Activation functions
Activation functions, Logistic and Hyperbolic ones
Relevance
Fine-tuning neural network hyperparameters
Hidden layers
Vanishing gradients
Exploding gradients
Is the Logistic activation function (Sigmoid) our choice?
Logistic function as the root of problems
The derivative of the Logistic function
Insights from the paper by Glorot and Bengio
The ReLU function family
ELU function
Which activation function should we use?
More on activation functions, output layers
Batch Normalization
Dropout
Gradient Clipping
A top-down perspective on Neural networks
More top-down perspectives
Limitations of supervised learning with deep networks
Limitations of NNs
Homogeneous data
More limitations