Further Manipulations

Let us specialize first to the case where we have only two parameters \( \beta_0 \) and \( \beta_1 \). Our result for \( \beta_0 \) then simplifies to

$$ n\beta_0 = \sum_{i=0}^{n-1}y_i - \sum_{i=0}^{n-1} X_{i1} \beta_1. $$

Dividing by \( n \), we obtain

$$ \beta_0 = \frac{1}{n}\sum_{i=0}^{n-1}y_i - \beta_1\frac{1}{n}\sum_{i=0}^{n-1} X_{i1}. $$

If we define the mean value of the inputs \( X_{i1} \) as

$$ \mu_1=\frac{1}{n}\sum_{i=0}^{n-1} X_{i1}, $$

and if we define the mean value of the outputs as

$$ \mu_y=\frac{1}{n}\sum_{i=0}^{n-1}y_i, $$

we have

$$ \beta_0 = \mu_y - \beta_1\mu_{1}. $$
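The relation \( \beta_0 = \mu_y - \beta_1\mu_1 \) can be checked numerically. The following is a minimal sketch, assuming synthetic data drawn from a straight line with noise; the data, the seed, and the variable names are illustrative assumptions and not part of the derivation above.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2 + 3x + noise
rng = np.random.default_rng(2024)
n = 100
x = rng.uniform(0.0, 1.0, n)
y = 2.0 + 3.0 * x + 0.1 * rng.normal(size=n)

# Design matrix with a column of ones for the intercept beta_0
X = np.column_stack((np.ones(n), x))

# Ordinary least squares fit of (beta_0, beta_1)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
beta0, beta1 = beta

# Intercept reconstructed from the mean values mu_y and mu_1
mu_1 = np.mean(x)
mu_y = np.mean(y)
beta0_from_means = mu_y - beta1 * mu_1

print(f"beta0 from fit:   {beta0:.6f}")
print(f"beta0 from means: {beta0_from_means:.6f}")
```

The two printed values should agree to within floating-point precision, since the relation follows directly from the equation for \( \beta_0 \).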

In the general case, that is, when we have more parameters than \( \beta_0 \) and \( \beta_1 \), we have

$$ \beta_0 = \frac{1}{n}\sum_{i=0}^{n-1}y_i - \frac{1}{n}\sum_{i=0}^{n-1}\sum_{j=1}^{p-1} X_{ij}\beta_j. $$
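The general expression can be verified in the same way. The sketch below assumes a design matrix whose first column consists of ones and whose remaining \( p-1 \) columns are random features; the chosen coefficients and dimensions are hypothetical.

```python
import numpy as np

# Synthetic data (an assumption for illustration) with p = 4 parameters
rng = np.random.default_rng(7)
n, p = 200, 4
features = rng.normal(size=(n, p - 1))
X = np.column_stack((np.ones(n), features))
true_beta = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_beta + 0.1 * rng.normal(size=n)

# Ordinary least squares fit of all p parameters
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# beta_0 = mean(y) - sum_j mean(X_j) * beta_j for j = 1, ..., p-1
column_means = X[:, 1:].mean(axis=0)
intercept_from_means = y.mean() - column_means @ beta[1:]

print(f"beta0 from fit:   {beta[0]:.6f}")
print(f"beta0 from means: {intercept_from_means:.6f}")
```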

Replacing \( y_i \) with \( y_i - \overline{\boldsymbol{y}} \), where \( \overline{\boldsymbol{y}} = \mu_y \) is the mean value of the outputs, and also centering our design matrix results in a cost function (in vector-matrix disguise)

$$ C(\boldsymbol{\beta}) = (\boldsymbol{\tilde{y}} - \tilde{X}\boldsymbol{\beta})^T(\boldsymbol{\tilde{y}} - \tilde{X}\boldsymbol{\beta}). $$
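As a hedged illustration of the centered cost function, the sketch below fits the same synthetic data twice: once with an explicit intercept column, and once with centered outputs and a centered design matrix without an intercept column. The data and names such as `X_tilde` and `y_tilde` are assumptions made for this example. Under these assumptions the remaining parameters coincide, and the intercept can be recovered afterwards from the mean values.

```python
import numpy as np

# Synthetic data (an assumption for illustration) with non-zero feature means
rng = np.random.default_rng(42)
n = 150
X_feat = rng.normal(loc=5.0, scale=2.0, size=(n, 3))
y = 4.0 + X_feat @ np.array([1.5, -0.7, 2.0]) + 0.1 * rng.normal(size=n)

# Fit with an explicit intercept column
X_full = np.column_stack((np.ones(n), X_feat))
beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)

# Center the outputs and each column of the design matrix
X_mean = X_feat.mean(axis=0)
y_mean = y.mean()
X_tilde = X_feat - X_mean
y_tilde = y - y_mean

# Minimize (y_tilde - X_tilde beta)^T (y_tilde - X_tilde beta), no intercept
beta_centered, *_ = np.linalg.lstsq(X_tilde, y_tilde, rcond=None)

# Recover the intercept from the mean values
intercept = y_mean - X_mean @ beta_centered

print("slopes (full fit):    ", beta_full[1:])
print("slopes (centered fit):", beta_centered)
print("intercept (full fit): ", beta_full[0])
print("intercept (recovered):", intercept)
```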