Updating the gradients

With the back-propagated error for each $l = L-1, L-2, \ldots, 1$ given as

$$\delta_j^l = \sum_k \delta_k^{l+1} w_{kj}^{l+1} \sigma'(z_j^l),$$

we update the weights and the biases using gradient descent for each $l = L-1, L-2, \ldots, 1$ according to the rules

$$w_{jk}^l \leftarrow w_{jk}^l - \eta \delta_j^l a_k^{l-1},$$

$$b_j^l \leftarrow b_j^l - \eta \frac{\partial \mathcal{C}}{\partial b_j^l} = b_j^l - \eta \delta_j^l,$$
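The backward sweep and the update rules above can be sketched in NumPy. This is a minimal illustration, not the notes' reference implementation: the network sizes, the sigmoid activation, and the quadratic cost used for the output-layer error are assumptions, and the forward pass is included only so the stored $z^l$ and $a^l$ are available for the updates.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# Hypothetical small network: sizes, weights w[l], and biases b[l]
# are placeholders chosen for the sketch.
rng = np.random.default_rng(0)
sizes = [3, 4, 2]  # input, hidden, output layer widths (assumption)
w = [rng.standard_normal((sizes[l + 1], sizes[l])) for l in range(2)]
b = [rng.standard_normal(s) for s in sizes[1:]]

x = rng.standard_normal(sizes[0])
y = np.array([1.0, 0.0])  # target (assumption)

# Forward pass, storing z^l and a^l for the backward sweep.
a = [x]
zs = []
for wl, bl in zip(w, b):
    zs.append(wl @ a[-1] + bl)
    a.append(sigmoid(zs[-1]))

eta = 0.1
# Output-layer error, here for a quadratic cost (assumption).
delta = (a[-1] - y) * sigmoid_prime(zs[-1])
for l in range(len(w) - 1, -1, -1):
    # Gradient-descent updates:
    #   w^l <- w^l - eta * delta^l (a^{l-1})^T,  b^l <- b^l - eta * delta^l
    w[l] -= eta * np.outer(delta, a[l])
    b[l] -= eta * delta
    if l > 0:
        # Back-propagate the error one layer:
        #   delta^l = ((w^{l+1})^T delta^{l+1}) * sigma'(z^l)
        delta = (w[l].T @ delta) * sigmoid_prime(zs[l - 1])
```

Note that the update for layer $l$ uses the activation $a^{l-1}$ from the previous layer, so the error is propagated to layer $l-1$ only after the weights of layer $l$ have been adjusted.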