We have the ouput layer given by layer label \( (2) \) with output \( a^{(2)} \) and weights and biases to be determined given by the variables
$$ w_{i}^{(2)}=\left\{w_{1}^{(2)},w_{2}^{(2)}\right\} \wedge b^{(2)}. $$Our output is \( \tilde{y}=a^{(2)} \) and we define a generic cost function \( C(a^{(2)},y;\boldsymbol{\Theta}) \) where \( y \) is the target value (a scalar here). The parameters we need to optimize are given by
$$ \boldsymbol{\Theta}=\left\{w_{11}^{(1)},w_{12}^{(1)},w_{21}^{(1)},w_{22}^{(1)},w_{1}^{(2)},w_{2}^{(2)},b_1^{(1)},b_2^{(1)},b^{(2)}\right\}. $$