The second requirement excludes all linear functions. Furthermore, in an MLP with only linear activation functions, each layer simply performs a linear transformation of its inputs.
Regardless of the number of layers, the output of the NN is then nothing but a linear function of the inputs. Thus we need to introduce some kind of non-linearity into the NN to be able to fit non-linear functions. Typical examples are the logistic sigmoid
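The collapse of stacked linear layers into a single linear map can be verified directly; the following is a minimal NumPy sketch with arbitrary layer sizes chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with purely linear activations: f(x) = W2 @ (W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

def two_layer_linear(x):
    return W2 @ (W1 @ x + b1) + b2

# The composition is itself a single linear map W @ x + b with
# W = W2 @ W1 and b = W2 @ b1 + b2 -- depth adds no expressive power here.
W = W2 @ W1
b = W2 @ b1 + b2

x = rng.normal(size=3)
assert np.allclose(two_layer_linear(x), W @ x + b)
```

No matter how many such layers are stacked, the same collapse applies inductively, which is why a non-linear activation is essential.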
$$ \sigma(x) = \frac{1}{1 + e^{-x}}, $$and the hyperbolic tangent function
$$ \sigma(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}. $$
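Both activations squash their input into a bounded range, and they are closely related: the tanh is a shifted and rescaled sigmoid, $\tanh(x) = 2\sigma(2x) - 1$. A minimal NumPy sketch (the helper name `sigmoid` is chosen here for illustration):

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 101)
s = sigmoid(x)      # values lie in the open interval (0, 1)
t = np.tanh(x)      # values lie in the open interval (-1, 1)

# tanh is a rescaled sigmoid: tanh(x) = 2 * sigmoid(2x) - 1
assert np.allclose(t, 2.0 * sigmoid(2.0 * x) - 1.0)
```

The difference in output range matters in practice: tanh is zero-centered, whereas the sigmoid outputs only positive values.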