More on activation functions, output layers
In most cases you can use the ReLU activation function in the hidden
layers (or one of its variants).
It is a bit faster to compute than other activation functions, and
gradient descent generally does not get stuck, because ReLU does not
saturate for large positive inputs (unlike the sigmoid or tanh).
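To illustrate the saturation point, here is a minimal NumPy sketch (the
function names are my own, not from the text) comparing the gradients of
ReLU and the sigmoid: the sigmoid gradient shrinks toward zero for large
inputs, while ReLU keeps a constant gradient of 1 on the positive side.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # 1 for positive inputs, 0 otherwise: never saturates on the
    # positive side, so backpropagated gradients do not shrink there.
    return (z > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

z = np.array([-5.0, -1.0, 0.5, 5.0])
print(relu_grad(z))     # [0. 0. 1. 1.]
print(sigmoid_grad(z))  # ~[0.0066 0.1966 0.2350 0.0066] -> saturates for large |z|
```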
For the output layer:
- For classification tasks where the classes are mutually exclusive, the softmax activation function is generally a good choice.
- For regression tasks, you can simply use no activation function at all (a linear output), so the network can predict any real value. Both cases are sketched below.
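As a rough sketch of how these choices look in code, here are two small
Keras models (the layer sizes, input shape, and class count are
placeholders, not values from the text): one classifier with ReLU hidden
layers and a softmax output, and one regression model whose output layer
has no activation.

```python
from tensorflow import keras

# Classification: ReLU hidden layers, softmax output (mutually exclusive classes).
clf = keras.Sequential([
    keras.Input(shape=(20,)),                      # 20 input features (placeholder)
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),  # one probability per class
])
clf.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

# Regression: same hidden layers, but no activation on the output layer,
# so the network can output any real value.
reg = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1),                         # no activation = linear output
])
reg.compile(loss="mse", optimizer="adam")
```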