Setting up the back propagation algorithm, part 3

Finally, we update the weights and the biases using gradient descent. For each layer $l=L-1,L-2,\dots,1$ (down to the first hidden layer), the parameters are updated according to the rules

$$
w_{ij}^l \leftarrow w_{ij}^l - \eta \delta_j^l a_i^{l-1},
$$

$$
b_j^l \leftarrow b_j^l - \eta \frac{\partial {\cal C}}{\partial b_j^l} = b_j^l - \eta \delta_j^l,
$$

with $\eta$ being the learning rate.
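
As a minimal sketch of this final step, the code below assumes the errors $\delta_j^l$ have already been computed in the backward pass and that the weights, biases, activations and errors are stored per layer as NumPy arrays. The container layout and the function name update_parameters are illustrative assumptions, not fixed by the text:

```python
import numpy as np

def update_parameters(weights, biases, activations, delta, eta):
    """One gradient-descent update of all weights and biases.

    Assumed (hypothetical) layout, layers numbered 0 (input) to L-1:
      weights[l]     : array of shape (n_{l-1}, n_l), entries w_{ij}^l
      biases[l]      : array of shape (n_l,), entries b_j^l
      activations[l] : array of shape (n_l,), entries a_j^l
      delta[l]       : array of shape (n_l,), errors delta_j^l
      eta            : learning rate
    Index 0 of weights/biases is an unused placeholder (e.g. None),
    since the input layer carries no parameters.
    """
    for l in range(len(weights) - 1, 0, -1):  # l = L-1, L-2, ..., 1
        # w_{ij}^l <- w_{ij}^l - eta * delta_j^l * a_i^{l-1}:
        # an outer product of the previous layer's activations
        # with the current layer's errors.
        weights[l] -= eta * np.outer(activations[l - 1], delta[l])
        # b_j^l <- b_j^l - eta * delta_j^l
        biases[l] -= eta * delta[l]
```

Note that the weight update is an outer product: each entry $w_{ij}^l$ is decreased by $\eta$ times the product of the incoming activation $a_i^{l-1}$ and the error $\delta_j^l$ of the receiving node.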