Let us first specialize to the case where we have only two parameters \beta_0 and \beta_1. Our result for \beta_0 then simplifies to
n\beta_0 = \sum_{i=0}^{n-1}y_i - \sum_{i=0}^{n-1} X_{i1} \beta_1. Dividing by n, we then obtain
\beta_0 = \frac{1}{n}\sum_{i=0}^{n-1}y_i - \beta_1\frac{1}{n}\sum_{i=0}^{n-1} X_{i1}. If we define the mean value of the column X_{i1} as
\mu_1=\frac{1}{n}\sum_{i=0}^{n-1} X_{i1}, and if we define the mean value of the outputs as
\mu_y=\frac{1}{n}\sum_{i=0}^{n-1}y_i, we have
\beta_0 = \mu_y - \beta_1\mu_{1}. In the general case, that is, when we have more parameters than \beta_0 and \beta_1, we have
\beta_0 = \frac{1}{n}\sum_{i=0}^{n-1}y_i - \frac{1}{n}\sum_{i=0}^{n-1}\sum_{j=1}^{p-1} X_{ij}\beta_j. Replacing y_i with y_i - \overline{\boldsymbol{y}}, where \overline{\boldsymbol{y}} = \mu_y, and also centering our design matrix, that is replacing X_{ij} with X_{ij} - \mu_j where \mu_j = \frac{1}{n}\sum_{i=0}^{n-1} X_{ij}, results in a cost function (in vector-matrix disguise)
C(\boldsymbol{\beta}) = (\boldsymbol{\tilde{y}} - \tilde{X}\boldsymbol{\beta})^T(\boldsymbol{\tilde{y}} - \tilde{X}\boldsymbol{\beta}).
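
The relation between the intercept and the centered problem can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using synthetic data. The sizes, the seed, the coefficients used to generate the data, and all variable names are illustrative choices and not part of the derivation above. It fits ordinary least squares once with an explicit column of ones and once on centered data without an intercept, and then recovers \beta_0 from the means.

```python
import numpy as np

# Synthetic data; the sizes, seed and "true" coefficients are arbitrary choices.
rng = np.random.default_rng(2024)
n, p = 100, 3                        # p counts the intercept plus two features
X = rng.normal(size=(n, p - 1))      # feature columns X_{i1}, X_{i2}
y = 2.0 + X @ np.array([1.5, -0.7]) + 0.1 * rng.normal(size=n)

# Fit with an explicit intercept column of ones: parameters (beta_0, beta_1, beta_2).
X_full = np.column_stack([np.ones(n), X])
beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)

# Center the outputs and each feature column, then fit without an intercept.
mu_x = X.mean(axis=0)                # column means mu_j
mu_y = y.mean()                      # mean of the outputs
beta_centered, *_ = np.linalg.lstsq(X - mu_x, y - mu_y, rcond=None)

# Recover the intercept from beta_0 = mu_y - sum_j mu_j beta_j.
beta0 = mu_y - mu_x @ beta_centered

print("slopes, full fit     :", beta_full[1:])
print("slopes, centered fit :", beta_centered)
print("intercept, full fit  :", beta_full[0])
print("intercept, from means:", beta0)
```

Up to floating-point round-off, the slopes from the two fits should agree, and the intercept of the full fit should match the value reconstructed from the means.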