The code here implements the above model with one hidden layer and scalar variables, for the same function we studied in the previous example. The code is, however, set up so that we can add multiple inputs \( x \) and target values \( y \). Note also that we have the possibility of defining a feature matrix \( \boldsymbol{X} \) with more than one column for the input values. This will turn out to be useful in our next example. We have also defined matrices and vectors for all of our operations, although this is not strictly necessary here.
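To make the connection between the model and the code explicit, here is a short recap of the quantities computed below (we write \( \sigma \) for the Sigmoid function and \( \circ \) for the element-wise product). The forward pass computes
\begin{align*}
z_1 &= \boldsymbol{X} w_1 + b_1, \qquad a_1 = \sigma(z_1), \\
z_2 &= a_1 w_2 + b_2, \qquad a_2 = z_2,
\end{align*}
while backpropagation, with the cost \( C=\frac{1}{2}(a_2-y)^2 \), evaluates
\begin{align*}
\delta_2 &= a_2 - y, \qquad \delta_1 = \delta_2 w_2^T \circ a_1 \circ (1 - a_1), \\
\frac{\partial C}{\partial w_2} &= a_1^T \delta_2, \qquad \frac{\partial C}{\partial b_2} = \delta_2, \qquad \frac{\partial C}{\partial w_1} = \boldsymbol{X}^T \delta_1, \qquad \frac{\partial C}{\partial b_1} = \delta_1,
\end{align*}
where the bias gradients are summed over the rows of \( \boldsymbol{X} \) when there is more than one input. These are exactly the quantities returned by the functions forwardpropagation and backpropagation in the code.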
import numpy as np
# We use the Sigmoid function as activation function
def sigmoid(z):
    return 1.0/(1.0+np.exp(-z))
def forwardpropagation(x):
    # weighted sum of inputs to the hidden layer
    z_1 = np.matmul(x, w_1) + b_1
    # activation in the hidden layer
    a_1 = sigmoid(z_1)
    # weighted sum of inputs to the output layer
    z_2 = np.matmul(a_1, w_2) + b_2
    # linear (identity) activation in the output layer
    a_2 = z_2
    return a_1, a_2
def backpropagation(x, y):
    a_1, a_2 = forwardpropagation(x)
    # parameter delta for the output layer, note that a_2=z_2 and its derivative wrt z_2 is just 1
    delta_2 = a_2 - y
    # print the current value of the cost function 0.5*(a_2-y)^2
    print(0.5*((a_2-y)**2))
    # delta for the hidden layer
    delta_1 = np.matmul(delta_2, w_2.T) * a_1 * (1 - a_1)
    # gradients for the output layer
    output_weights_gradient = np.matmul(a_1.T, delta_2)
    output_bias_gradient = np.sum(delta_2, axis=0)
    # gradients for the hidden layer
    hidden_weights_gradient = np.matmul(x.T, delta_1)
    hidden_bias_gradient = np.sum(delta_1, axis=0)
    return output_weights_gradient, output_bias_gradient, hidden_weights_gradient, hidden_bias_gradient
# ensure the same random numbers appear every time
np.random.seed(0)
# Input variable, stored as a feature matrix with one row (sample) and one column (feature)
x = np.array([[4.0]], dtype=np.float64)
# Target values
y = 2*x+1.0
# Defining the neural network, only scalars here
n_inputs = x.shape[0]
n_features = 1
n_hidden_neurons = 1
n_outputs = 1
# Initialize the network
# weights and bias in the hidden layer
w_1 = np.random.randn(n_features, n_hidden_neurons)
b_1 = np.zeros(n_hidden_neurons) + 0.01
# weights and bias in the output layer
w_2 = np.random.randn(n_hidden_neurons, n_outputs)
b_2 = np.zeros(n_outputs) + 0.01
eta = 0.1
for i in range(50):
    # calculate gradients
    derivW2, derivB2, derivW1, derivB1 = backpropagation(x, y)
    # update weights and biases
    w_2 -= eta * derivW2
    b_2 -= eta * derivB2
    w_1 -= eta * derivW1
    b_1 -= eta * derivB1
We see that after just a few iterations (the results do, however, depend on the learning rate), the error becomes rather small.
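As a quick sanity check (this extra snippet is not part of the program above; it simply reuses the forwardpropagation function with the trained weights), we can perform a final forward pass and compare the network output with the target value \( y = 2x+1 = 9 \) for \( x=4 \):
# final forward pass with the trained parameters
_, prediction = forwardpropagation(x)
print("prediction:", prediction)
print("target:    ", y)
The printed prediction should then be close to the target, in line with the small error observed above.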