Forward and reverse modes

We have that

$$\frac{df}{dx}=\frac{df}{db}\frac{db}{da}\frac{da}{dx},$$

which we can rewrite either as

$$\frac{df}{dx}=\left[\frac{df}{db}\frac{db}{da}\right]\frac{da}{dx},$$

or

$$\frac{df}{dx}=\frac{df}{db}\left[\frac{db}{da}\frac{da}{dx}\right].$$

The first expression is called reverse mode (or back propagation), since we start by evaluating the derivatives at the end point and then propagate backwards. This is the standard way of evaluating derivatives (gradients) when optimizing the parameters of a neural network. In the context of deep learning this is computationally more efficient, since the output of a neural network typically consists of only one or a few variables, while the number of input parameters can be very large.
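
As a minimal sketch (not from the text), assume the concrete composition a(x) = x^2, b(a) = sin(a), f(b) = exp(b); reverse mode then accumulates the product of the local derivatives starting from the output:

```python
import math

# Hypothetical composition used only for illustration:
# a(x) = x**2, b(a) = sin(a), f(b) = exp(b), so f depends on x through a and b.
def forward_pass(x):
    a = x**2
    b = math.sin(a)
    f = math.exp(b)
    return a, b, f

def reverse_mode_df_dx(x):
    """Accumulate df/dx as [df/db * db/da] * da/dx, starting at the output."""
    a, b, f = forward_pass(x)   # store the intermediate values
    grad = 1.0                  # df/df
    grad *= math.exp(b)         # multiply by df/db
    grad *= math.cos(a)         # multiply by db/da
    grad *= 2.0 * x             # multiply by da/dx
    return grad
```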

The second expression defines the so-called forward mode, in which the derivatives are instead propagated from the input x towards the output.
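
Continuing the same illustrative composition as above, a forward-mode sketch propagates a tangent from the input instead:

```python
import math

def forward_mode_df_dx(x):
    """Accumulate df/dx as df/db * [db/da * da/dx], starting at the input."""
    a = x**2
    da_dx = 2.0 * x                 # tangent of a, seeded with dx/dx = 1
    b = math.sin(a)
    db_dx = math.cos(a) * da_dx     # db/da * da/dx
    f = math.exp(b)
    df_dx = math.exp(b) * db_dx     # df/db * db/dx
    return df_dx
```

Both sketches return the same number for a given x; the practical difference is that reverse mode needs one backward sweep per output variable, whereas forward mode needs one forward sweep per input variable, which is why reverse mode is preferred when a scalar loss depends on many parameters.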