Week 42 Constructing a Neural Network code with examples


Using the chain rule and summing over all $k$ entries

We obtain

$\delta_j^l =\sum_k \frac{\partial {\cal C}}{\partial z_k^{l+1}}\frac{\partial z_k^{l+1}}{\partial z_j^{l}}=\sum_k \delta_k^{l+1}\frac{\partial z_k^{l+1}}{\partial z_j^{l}},$

and recalling that

$z_j^{l+1} = \sum_{i=1}^{M_{l}}w_{ij}^{l+1}a_i^{l}+b_j^{l+1},$

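with $M_l$ being the number of nodes in layer $l$. Since $a_j^l=\sigma(z_j^l)$, only the term $i=j$ in the sum for $z_k^{l+1}$ depends on $z_j^l$, so that $\partial z_k^{l+1}/\partial z_j^{l}=w_{jk}^{l+1}\sigma'(z_j^l)$, and we obtain

$\delta_j^l =\sum_k \delta_k^{l+1}w_{jk}^{l+1}\sigma'(z_j^l).$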

This is our final equation.
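
As a quick sanity check, the recursion for $\delta_j^l$ can be compared against a finite-difference derivative of the cost. The sketch below is only an illustration under stated assumptions: a sigmoid activation, a quadratic cost, and layer $l+1$ taken as the output layer. All function and variable names (`backpropagate_delta`, `W_next`, and so on) are hypothetical. The weight matrix follows the index convention of the expression for $z_j^{l+1}$ above, so `W_next[j, k]` is the weight from node $j$ in layer $l$ to node $k$ in layer $l+1$.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def cost_from_z(z_l, W_next, b_next, target):
    """Quadratic cost as a function of z^l (assumed cost, for illustration)."""
    a_l = sigmoid(z_l)                 # a^l = sigma(z^l)
    z_next = W_next.T @ a_l + b_next   # z_k^{l+1} = sum_i w_{ik}^{l+1} a_i^l + b_k^{l+1}
    a_next = sigmoid(z_next)
    return 0.5 * np.sum((a_next - target) ** 2)


def backpropagate_delta(delta_next, W_next, z_l):
    """delta_j^l = sum_k delta_k^{l+1} W_next[j, k] * sigma'(z_j^l)."""
    return (W_next @ delta_next) * sigmoid_prime(z_l)


def numerical_delta(z_l, W_next, b_next, target, eps=1e-6):
    """Central finite-difference estimate of dC/dz_j^l for every j."""
    grad = np.zeros_like(z_l)
    for j in range(z_l.size):
        step = np.zeros_like(z_l)
        step[j] = eps
        c_plus = cost_from_z(z_l + step, W_next, b_next, target)
        c_minus = cost_from_z(z_l - step, W_next, b_next, target)
        grad[j] = (c_plus - c_minus) / (2.0 * eps)
    return grad


# Small random example: M_l = 4 nodes in layer l, 3 nodes in layer l+1.
rng = np.random.default_rng(0)
M_l, M_next = 4, 3
W_next = rng.normal(size=(M_l, M_next))
b_next = rng.normal(size=M_next)
z_l = rng.normal(size=M_l)
target = rng.uniform(size=M_next)

# Output-layer error for the assumed quadratic cost and sigmoid activation.
z_next = W_next.T @ sigmoid(z_l) + b_next
delta_next = (sigmoid(z_next) - target) * sigmoid_prime(z_next)

delta_l = backpropagate_delta(delta_next, W_next, z_l)
delta_fd = numerical_delta(z_l, W_next, b_next, target)

print("backpropagated:    ", delta_l)
print("finite difference: ", delta_fd)
```

The two printed vectors should agree up to the finite-difference error. Writing the sum over $k$ as the matrix-vector product `W_next @ delta_next` is a design choice that avoids an explicit loop over the nodes of layer $l+1$.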

We are now ready to set up the algorithm for back propagation and learning the weights and biases.
