Overarching view of a neural network
The architecture of a neural network defines our model. This model
aims at describing some function \( f(\boldsymbol{x}) \) which represents
some final result (outputs or target values) \( \boldsymbol{y} \) given a specific input
\( \boldsymbol{x} \). Note that here \( \boldsymbol{y} \) and \( \boldsymbol{x} \) are not limited to be
vectors.
The architecture consists of
- An input and an output layer where the input layer is defined by the inputs \( \boldsymbol{x} \). The output layer produces the model output \( \boldsymbol{\tilde{y}} \) which is compared with the target value \( \boldsymbol{y} \)
- A given number of hidden layers and neurons/nodes/units for each layer (this may vary)
- A given activation function \( \sigma(\boldsymbol{z}) \) with arguments \( \boldsymbol{z} \) to be defined below. The activation functions may differ from layer to layer.
- The last layer, normally called the output layer, usually has an activation function tailored to the specific problem
- Finally we define a so-called cost or loss function which is used to gauge the quality of our model.
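The components listed above can be sketched in a few lines of code. This is a minimal illustration, not a prescribed implementation: the layer sizes, the choice of the sigmoid as activation function \( \sigma(\boldsymbol{z}) \), and the mean-squared-error cost are illustrative assumptions, as are all variable names.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    """Activation function sigma(z), applied elementwise to the argument z."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative architecture: 2 inputs -> 3 hidden nodes -> 1 output.
W1 = rng.standard_normal((2, 3))   # input-to-hidden weights
b1 = np.zeros(3)                   # hidden-layer biases
W2 = rng.standard_normal((3, 1))   # hidden-to-output weights
b2 = np.zeros(1)                   # output-layer bias

def forward(x):
    """Compute the model output y_tilde for a batch of inputs x."""
    z1 = x @ W1 + b1       # argument z of the hidden-layer activation
    a1 = sigmoid(z1)       # hidden-layer activation
    z2 = a1 @ W2 + b2      # argument z of the output-layer activation
    return sigmoid(z2)     # output activation (an illustrative choice)

def cost(y_tilde, y):
    """Mean-squared-error cost gauging the quality of the model."""
    return 0.5 * np.mean((y_tilde - y) ** 2)

x = np.array([[0.5, -1.2]])   # one input sample
y = np.array([[1.0]])         # its target value
y_tilde = forward(x)
print(y_tilde.shape, cost(y_tilde, y))
```

The forward pass mirrors the list above: each layer applies an affine map followed by its activation function, and the cost function compares the model output \( \boldsymbol{\tilde{y}} \) with the target \( \boldsymbol{y} \).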