Week 42 Constructing a Neural Network code with examples

Contents

Analyzing the last results

This is an important expression. The second term on the right-hand side measures how fast the cost function changes as a function of the $j$th output activation. If, for example, the cost function does not depend much on a particular output node $j$, then $\delta_j^L$ will be small, which is what we would expect. The first term on the right measures how fast the activation function $f$ changes at a given activation value $z_j^L$.
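
To make the roles of the two terms concrete, here is a minimal numerical sketch. It assumes the expression under discussion has the usual form $\delta_j^L = f'(z_j^L)\,\partial C/\partial a_j^L$, and it picks a sigmoid activation and a squared-error cost purely for illustration. The function names (`sigmoid`, `cost_gradient`, `output_error`) and the input values are hypothetical and are not taken from the notes.

```python
import numpy as np

def sigmoid(z):
    """Assumed activation function f(z)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    """First term: f'(z), how fast the activation function changes at z."""
    s = sigmoid(z)
    return s * (1.0 - s)

def cost_gradient(a, t):
    """Second term: dC/da for the assumed cost C = 0.5 * sum_j (a_j - t_j)^2."""
    return a - t

def output_error(z, t):
    """Output error delta_j^L = f'(z_j^L) * dC/da_j^L, computed elementwise."""
    a = sigmoid(z)
    return sigmoid_derivative(z) * cost_gradient(a, t)

# Three hypothetical output nodes, all with target 1.
z_L = np.array([0.0, 2.0, -10.0])
targets = np.array([1.0, 1.0, 1.0])

delta_L = output_error(z_L, targets)
print("f'(z)  :", sigmoid_derivative(z_L))
print("dC/da  :", cost_gradient(sigmoid(z_L), targets))
print("delta_L:", delta_L)
```

With these inputs the third node has a cost gradient close to $-1$, yet its $\delta_j^L$ is tiny because the sigmoid is saturated there and $f'(z_j^L)$ is nearly zero. Both factors matter: a small value of either one makes the output error small.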