Overarching view of a neural network
The architecture of a neural network defines our model. This model
aims at describing some function f(\boldsymbol{x}) which represents
some final result (outputs or target values \boldsymbol{y}) given a specific input
\boldsymbol{x} . Note that here \boldsymbol{y} and \boldsymbol{x} need not be
vectors.
The architecture consists of
- An input and an output layer, where the input layer is defined by the inputs \boldsymbol{x} . The output layer produces the model output \boldsymbol{\tilde{y}} , which is compared with the target value \boldsymbol{y} .
- A given number of hidden layers and a given number of neurons/nodes/units for each layer (the number of neurons may vary from layer to layer)
- A given activation function \sigma(\boldsymbol{z}) with arguments \boldsymbol{z} to be defined below. The activation functions may differ from layer to layer.
- The last layer, normally called the output layer, usually has an activation function tailored to the specific problem
- Finally, we define a so-called cost or loss function, which is used to gauge the quality of our model.
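
The ingredients listed above can be collected in a short NumPy sketch. This is only an illustration under stated assumptions, not a definitive implementation: we assume that each layer computes \boldsymbol{z} as an affine function of the previous layer's output (weights W and bias b, which the list above does not specify), we pick the sigmoid for the hidden layers and the identity for the output layer, and we use the mean squared error as cost function. The layer sizes and all function names are hypothetical choices made for this example.

```python
import numpy as np

def sigmoid(z):
    # Example hidden-layer activation (an assumed choice)
    return 1.0 / (1.0 + np.exp(-z))

def identity(z):
    # Example output activation, suitable for a regression-type problem
    return z

def init_layers(layer_sizes, rng):
    # One (weights, bias) pair per transition between consecutive layers
    return [
        (rng.normal(size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def feed_forward(x, layers, activations):
    # Propagate the input through all layers and return the model output
    a = x
    for (W, b), sigma in zip(layers, activations):
        z = a @ W + b   # assumed affine form of the argument z
        a = sigma(z)    # the activation may differ from layer to layer
    return a

def mse_cost(y_tilde, y):
    # Mean squared error between model output and target values
    return np.mean((y_tilde - y) ** 2)

rng = np.random.default_rng(0)

# Architecture: 2 inputs, two hidden layers (4 and 3 neurons), 1 output
layer_sizes = [2, 4, 3, 1]
activations = [sigmoid, sigmoid, identity]
layers = init_layers(layer_sizes, rng)

# Five random samples used purely as placeholder data
x = rng.normal(size=(5, 2))
y = rng.normal(size=(5, 1))

y_tilde = feed_forward(x, layers, activations)
cost = mse_cost(y_tilde, y)
print(y_tilde.shape, cost)
```

Changing the entries of layer_sizes and activations changes the architecture without touching the rest of the code, which mirrors the fact that the number of layers, the number of neurons per layer and the activation functions are all choices we make when defining the model.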