Gradient expressions

For this specific model, with just one output node and two hidden nodes, the gradient descent updates take the following form. For the output layer,

$$w^{(2)}_i \leftarrow w^{(2)}_i - \eta\, \delta^{(2)} a^{(1)}_i,$$

and

$$b^{(2)} \leftarrow b^{(2)} - \eta\, \delta^{(2)},$$

and for the hidden layer,

$$w^{(1)}_{ij} \leftarrow w^{(1)}_{ij} - \eta\, \delta^{(1)}_i a^{(0)}_j,$$

and

$$b^{(1)}_i \leftarrow b^{(1)}_i - \eta\, \delta^{(1)}_i,$$

where η is the learning rate.
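
To make the updates concrete, here is a minimal NumPy sketch of one gradient-descent step for a network of this shape. The input size of 2, the sigmoid activations, the squared-error loss used to form the δ terms, and all variable names are assumptions for illustration; the model in the text may make different choices.

```python
import numpy as np

# Hypothetical network shape: 2 inputs -> 2 hidden nodes -> 1 output node.
# Sigmoid activations and squared-error loss are assumptions for illustration.
rng = np.random.default_rng(0)

W1 = rng.normal(size=(2, 2))   # w^(1)_{ij}: hidden-layer weights
b1 = np.zeros(2)               # b^(1)_i:    hidden-layer biases
w2 = rng.normal(size=2)        # w^(2)_i:    output-layer weights
b2 = 0.0                       # b^(2):      output-layer bias
eta = 0.1                      # learning rate η


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_step(a0, y):
    """One gradient-descent update for a single training example (a0, y)."""
    global W1, b1, w2, b2

    # Forward pass
    z1 = W1 @ a0 + b1              # hidden pre-activations
    a1 = sigmoid(z1)               # a^(1)_i
    z2 = w2 @ a1 + b2              # output pre-activation
    a2 = sigmoid(z2)               # network output

    # Error terms (assuming squared-error loss and sigmoid units)
    delta2 = (a2 - y) * a2 * (1.0 - a2)        # δ^(2)
    delta1 = (w2 * delta2) * a1 * (1.0 - a1)   # δ^(1)_i

    # Updates exactly as in the equations above
    w2 -= eta * delta2 * a1                    # w^(2)_i ← w^(2)_i − η δ^(2) a^(1)_i
    b2 -= eta * delta2                         # b^(2)   ← b^(2)   − η δ^(2)
    W1 -= eta * np.outer(delta1, a0)           # w^(1)_{ij} ← w^(1)_{ij} − η δ^(1)_i a^(0)_j
    b1 -= eta * delta1                         # b^(1)_i ← b^(1)_i − η δ^(1)_i


# Example: one update on a single (input, target) pair
train_step(np.array([0.5, -1.0]), 1.0)
```

The outer product in the `W1` update mirrors the index structure of the weight equation: each entry gets the product of its row's δ term and its column's input activation.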