Having backpropagated the error, computing for each $l = L-1, L-2, \ldots, 1$
\[
\delta^l_j = \sum_k w^{l+1}_{kj}\,\delta^{l+1}_k\,\sigma'(z^l_j),
\]
we update the weights and biases by gradient descent, applying for each $l = L-1, L-2, \ldots, 1$ the rules
\[
w^l_{jk} \leftarrow w^l_{jk} - \eta\,\frac{\partial C}{\partial w^l_{jk}} = w^l_{jk} - \eta\,\delta^l_j\,a^{l-1}_k,
\qquad
b^l_j \leftarrow b^l_j - \eta\,\frac{\partial C}{\partial b^l_j} = b^l_j - \eta\,\delta^l_j.
\]
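To make the two steps concrete, here is a minimal NumPy sketch of a single backward pass with an immediate gradient-descent update. The function name `backprop_update`, the 0-indexed lists `weights`, `biases`, `activations`, `zs`, the output-layer error `delta_L`, and the sigmoid derivative `sigma_prime` are illustrative choices rather than notation from the text; the loop simply applies the error-propagation formula and the update rules above.

```python
import numpy as np

def sigma_prime(z):
    """Derivative of the sigmoid activation sigma(z) = 1 / (1 + exp(-z))."""
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)

def backprop_update(weights, biases, activations, zs, delta_L, eta):
    """One backward pass followed by a gradient-descent step per layer.

    weights[l], biases[l] : parameters feeding layer l+1 (0-indexed lists)
    activations[l]        : a^l, with activations[0] the network input
    zs[l]                 : weighted input weights[l] @ activations[l] + biases[l]
    delta_L               : error at the output layer
    eta                   : learning rate
    """
    L = len(weights)
    delta = delta_L
    for l in range(L - 1, -1, -1):
        # Gradients: dC/dw^l_{jk} = delta^l_j * a^{l-1}_k and dC/db^l_j = delta^l_j
        grad_w = np.outer(delta, activations[l])
        grad_b = delta
        if l > 0:
            # Propagate the error one layer back before overwriting weights[l]:
            # delta^{l-1}_j = sum_k w^l_{kj} delta^l_k * sigma'(z^{l-1}_j)
            delta = (weights[l].T @ delta) * sigma_prime(zs[l - 1])
        # Gradient-descent update rules from the text
        weights[l] -= eta * grad_w
        biases[l] -= eta * grad_b
    return weights, biases
```

Here the parameters are updated layer by layer as the error flows backwards; an equally common choice is to accumulate all gradients first (for example over a mini-batch) and apply a single update at the end.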