The derivative of the mean-squared error function

Computing \( \partial C(\beta) / \partial \beta_0 \) and \( \partial C(\beta) / \partial \beta_1 \), we find that the gradient can be written as

$$ \nabla_{\beta} C(\beta) = \frac{2}{n}\begin{bmatrix} \sum_{i=1}^{n} \left(\beta_0+\beta_1x_i-y_i\right) \\ \sum_{i=1}^{n}\left( x_i (\beta_0+\beta_1x_i)-y_ix_i\right) \end{bmatrix} = \frac{2}{n}X^T(X\beta - \mathbf{y}), $$

where \( X \) is the design matrix defined above.
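
As a quick sanity check, the two forms of the gradient can be compared numerically. The sketch below is only illustrative. It assumes that the design matrix has a column of ones followed by the column of \( x_i \) values, and that \( C(\beta) \) is the mean of the squared residuals. The synthetic data, the function names `mse_gradient` and `mse_gradient_sums`, and the test value of \( \beta \) are hypothetical choices made for this example.

```python
import numpy as np


def mse_gradient(X, y, beta):
    """Matrix form of the gradient: (2/n) X^T (X beta - y)."""
    n = X.shape[0]
    return (2.0 / n) * X.T @ (X @ beta - y)


def mse_gradient_sums(x, y, beta):
    """Component form: the two sums over the data points written out."""
    n = x.shape[0]
    residual = beta[0] + beta[1] * x - y
    return (2.0 / n) * np.array([residual.sum(), (x * residual).sum()])


# Hypothetical synthetic data for the check (not taken from the text).
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0.0, 2.0, size=n)
y = 4.0 + 3.0 * x + rng.normal(size=n)

# Assumed design matrix: a column of ones and a column with the x values.
X = np.column_stack([np.ones(n), x])
beta = np.array([0.5, -1.0])  # arbitrary point at which to evaluate the gradient

grad_matrix = mse_gradient(X, y, beta)
grad_sums = mse_gradient_sums(x, y, beta)


def cost(b):
    """Mean-squared error C(b) = (1/n) * sum of squared residuals."""
    return np.mean((X @ b - y) ** 2)


# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
grad_fd = np.array(
    [(cost(beta + eps * e) - cost(beta - eps * e)) / (2.0 * eps) for e in np.eye(2)]
)

print("matrix form:      ", grad_matrix)
print("component form:   ", grad_sums)
print("finite difference:", grad_fd)
```

The three printed vectors should agree to within rounding error. The matrix form is usually the one to prefer in practice, since it carries over unchanged when more columns are added to the design matrix.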