Overarching view of a neural network

The architecture of a neural network defines our model. This model aims to describe some function \boldsymbol{y} = f(\boldsymbol{x}) which represents some final result (outputs or target values) \boldsymbol{y} given a specific input \boldsymbol{x} . Note that here \boldsymbol{y} and \boldsymbol{x} are not limited to be vectors.

The architecture consists of

  1. An input and an output layer, where the input layer is defined by the inputs \boldsymbol{x} . The output layer produces the model output \boldsymbol{\tilde{y}} which is compared with the target value \boldsymbol{y} .
  2. A given number of hidden layers and neurons/nodes/units for each layer (this may vary)
  3. A given activation function \sigma(\boldsymbol{z}) with arguments \boldsymbol{z} to be defined below. The activation functions may differ from layer to layer.
  4. The last layer, normally called the output layer, has an activation function tailored to the specific problem.
  5. Finally, we define a so-called cost or loss function which is used to gauge the quality of our model.
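The components above can be sketched as a minimal feed-forward pass in NumPy. The layer sizes, the sigmoid activation, and the mean-squared-error cost below are illustrative assumptions, not choices prescribed by the text; in practice the activation may differ from layer to layer and the cost is tailored to the problem.

```python
import numpy as np

def sigmoid(z):
    # Activation function sigma(z), applied elementwise to the argument z
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(x, weights, biases):
    # Propagate the input x through the hidden layers and the output layer,
    # returning the model output y_tilde
    a = x
    for W, b in zip(weights, biases):
        z = W @ a + b      # argument z of the activation function
        a = sigmoid(z)     # same activation for every layer in this sketch
    return a

def mse_cost(y_tilde, y):
    # Cost/loss function gauging the quality of the model:
    # here mean-squared error between model output and target
    return 0.5 * np.sum((y_tilde - y) ** 2)

rng = np.random.default_rng(0)
# Hypothetical architecture: 3 inputs -> 4 hidden nodes -> 2 outputs
sizes = [3, 4, 2]
weights = [rng.standard_normal((sizes[i + 1], sizes[i]))
           for i in range(len(sizes) - 1)]
biases = [rng.standard_normal(sizes[i + 1])
          for i in range(len(sizes) - 1)]

x = np.array([0.5, -0.2, 0.1])   # input vector
y = np.array([1.0, 0.0])         # target values
y_tilde = feed_forward(x, weights, biases)
print(y_tilde, mse_cost(y_tilde, y))
```

Training the model then amounts to adjusting the weights and biases so that the cost decreases, which the later sections on optimization address.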