Processing math: 100%

 

 

 

Derivative of the cost function

With these definitions we can now compute the derivative of the cost function in terms of the weights.

Let us specialize to the output layer l=L. Our cost function is

C(ΘL)=12ni=1(yi˜yi)2=12ni=1(aLiyi)2,

The derivative of this function with respect to the weights is

C(ΘL)wLij=(aLjyj)aLjwLij,

The last partial derivative can easily be computed and reads (by applying the chain rule)

aLjwLij=aLjzLjzLjwLij=aLj(1aLj)aL1i.