Loading [MathJax]/extensions/TeX/boldsymbol.js

 

 

 

Setting up the back propagation algorithm

The four equations provide us with a way of computing the gradient of the cost function. Let us write this out in the form of an algorithm.

First, we set up the input data \hat{x} and the activations \hat{z}_1 of the input layer and compute the activation function and the pertinent outputs \hat{a}^1 .

Secondly, we perform then the feed forward till we reach the output layer and compute all \hat{z}_l of the input layer and compute the activation function and the pertinent outputs \hat{a}^l for l=1,2,3,\dots,L .

Notation: The first hidden layer has l=1 as label and the final output layer has l=L .