A neural network consists of a series of hidden layers in addition to the input and output layers. Each layer \( l \) has a set of parameters \( \boldsymbol{\Theta}^{(l)}=(\boldsymbol{W}^{(l)},\boldsymbol{b}^{(l)}) \), a weight matrix and a bias vector, which define an affine transformation mapping the activations of layer \( l-1 \) to the activations of layer \( l \); for a standard NN these are matrix-vector (or matrix-matrix, for batched inputs) multiplications followed by the addition of a bias. For all layers we will simply use a collective variable \( \boldsymbol{\Theta} \).
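As a minimal sketch of the affine transformation above, the code below computes \( \boldsymbol{z}^{(l)}=\boldsymbol{W}^{(l)}\boldsymbol{a}^{(l-1)}+\boldsymbol{b}^{(l)} \) for a single layer and applies a sigmoid activation; the layer sizes, the random initialization, and the choice of sigmoid are illustrative assumptions, not prescribed by the text.

```python
import numpy as np

def affine_forward(W, b, a_prev):
    """Affine transformation z = W a + b for one layer,
    with parameters Theta^(l) = (W^(l), b^(l))."""
    return W @ a_prev + b

def sigmoid(z):
    """Elementwise sigmoid activation (an assumed choice)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hidden = 4, 3                      # illustrative layer sizes
W = rng.standard_normal((n_hidden, n_in))  # W^(l)
b = rng.standard_normal(n_hidden)          # b^(l)
a_prev = rng.standard_normal(n_in)         # activations from layer l-1

z = affine_forward(W, b, a_prev)  # affine part
a = sigmoid(z)                    # activations of layer l, shape (3,)
```

Stacking such layers, so that the output activations of one layer become the input of the next, gives the full feed-forward network.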
The training of the network consists of two basic steps:

1. A feed-forward pass, where the input is propagated through the layers to produce the network output.
2. A back-propagation pass, where the error at the output is propagated backwards through the layers and the parameters \( \boldsymbol{\Theta} \) are updated, typically with a gradient-based method.

These two steps make up one iteration. This iterative process is continued until we reach a stopping criterion.
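The two-step iteration can be sketched as follows for the simplest possible case, a single sigmoid layer trained on toy data with plain gradient descent; the data, learning rate, tolerance, and cross-entropy loss are all illustrative assumptions rather than choices made in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 2))           # toy inputs (assumed)
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # toy binary targets (assumed)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Collective parameters Theta = (W, b) for a single-layer network
W = rng.standard_normal(2) * 0.1
b = 0.0
eta, tol, max_iters = 0.5, 1e-4, 10_000    # assumed hyperparameters

for it in range(max_iters):
    # Step 1: feed-forward pass
    a = sigmoid(X @ W + b)
    # Step 2: back-propagate the error and update Theta
    # (for sigmoid + cross-entropy, the output error is simply a - y)
    err = a - y
    grad_W = X.T @ err / len(y)
    grad_b = err.mean()
    W -= eta * grad_W
    b -= eta * grad_b
    # Stopping criterion: gradient norm below tolerance
    if np.sqrt(grad_W @ grad_W + grad_b**2) < tol:
        break

accuracy = ((a > 0.5) == y).mean()
```

Each pass through the loop body is one iteration in the sense described above; the loop ends either when the gradient norm falls below the tolerance or after a fixed maximum number of iterations.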