Week 35: From Ordinary Linear Regression to Ridge and Lasso Regression


Further Manipulations

Let us first specialize to the case where we have only two parameters $\beta_0$ and $\beta_1$. Our result for $\beta_0$ then simplifies to

$n\beta_0 = \sum_{i=0}^{n-1}y_i - \sum_{i=0}^{n-1} X_{i1} \beta_1.$

We then obtain

$\beta_0 = \frac{1}{n}\sum_{i=0}^{n-1}y_i - \beta_1\frac{1}{n}\sum_{i=0}^{n-1} X_{i1}.$

If we define

$\mu_1=\frac{1}{n}\sum_{i=0}^{n-1} X_{i1},$

and if we define the mean value of the outputs as

$\mu_y=\frac{1}{n}\sum_{i=0}^{n-1}y_i,$

we have

$\beta_0 = \mu_y - \beta_1\mu_{1}.$
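
The relation $\beta_0 = \mu_y - \beta_1\mu_1$ can be checked numerically. The following is a minimal sketch, assuming synthetic data drawn from a straight line with Gaussian noise; the data, the seed and the variable names are illustrative assumptions and not part of the derivation above. It fits both parameters with NumPy's least-squares solver and compares the fitted intercept with the value reconstructed from the two means.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2 + 3x + noise
rng = np.random.default_rng(2024)
n = 100
x = rng.uniform(0.0, 1.0, n)
y = 2.0 + 3.0 * x + 0.1 * rng.standard_normal(n)

# Design matrix with a column of ones (intercept) and the single feature X_{i1}
X = np.column_stack((np.ones(n), x))

# Ordinary least squares fit of beta_0 and beta_1
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
beta0, beta1 = beta

# Mean of the feature column and mean of the outputs
mu_1 = x.mean()
mu_y = y.mean()

# Intercept reconstructed from beta_0 = mu_y - beta_1 * mu_1
beta0_from_means = mu_y - beta1 * mu_1

print(f"beta_0 from the fit:   {beta0:.6f}")
print(f"beta_0 from the means: {beta0_from_means:.6f}")
```

The two printed numbers should agree to numerical precision, since the relation follows directly from the optimality condition for $\beta_0$.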

In the general case, that is, when we have more parameters than $\beta_0$ and $\beta_1$, we have

$\beta_0 = \frac{1}{n}\sum_{i=0}^{n-1}y_i - \frac{1}{n}\sum_{i=0}^{n-1}\sum_{j=1}^{p-1} X_{ij}\beta_j.$
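
Exchanging the order of the two sums makes the structure of this expression clearer. If we introduce the column means $\mu_j=\frac{1}{n}\sum_{i=0}^{n-1}X_{ij}$ (a notation assumed here, generalizing $\mu_1$ from the two-parameter case), the intercept reads

$\beta_0 = \mu_y - \sum_{j=1}^{p-1}\mu_j\beta_j,$

which reduces to $\beta_0 = \mu_y - \beta_1\mu_1$ when $p=2$.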

Replacing $y_i$ with $y_i - \overline{\boldsymbol{y}}$, where $\overline{\boldsymbol{y}}=\mu_y$ is the mean value of the outputs, and also centering our design matrix results in a cost function (in vector-matrix disguise)

$C(\boldsymbol{\beta}) = (\boldsymbol{\tilde{y}} - \tilde{X}\boldsymbol{\beta})^T(\boldsymbol{\tilde{y}} - \tilde{X}\boldsymbol{\beta}).$
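
The effect of centering can be illustrated with a small numerical experiment. The sketch below rests on several assumptions that are not part of the text above: synthetic data with three features, a fixed seed, and the names `features`, `X_tilde` and `y_tilde`. It also takes $\boldsymbol{\beta}$ in the centered cost function to contain only $\beta_1,\dots,\beta_{p-1}$, with the intercept recovered afterwards from the means. The fit on the centered data is compared with an ordinary fit that includes an explicit column of ones.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 3 features plus an intercept
rng = np.random.default_rng(35)
n, p = 200, 4  # p counts the intercept, so there are p - 1 features
features = rng.normal(size=(n, p - 1))
true_beta = np.array([1.5, -2.0, 0.5])
y = 4.0 + features @ true_beta + 0.1 * rng.standard_normal(n)

# Reference fit: design matrix with an explicit column of ones
X_full = np.column_stack((np.ones(n), features))
beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)

# Center the outputs and every column of the design matrix
X_mean = features.mean(axis=0)
y_mean = y.mean()
X_tilde = features - X_mean
y_tilde = y - y_mean

# Minimize C(beta) = (y_tilde - X_tilde beta)^T (y_tilde - X_tilde beta)
beta_centered, *_ = np.linalg.lstsq(X_tilde, y_tilde, rcond=None)

# Recover the intercept from the means of the outputs and of the columns
beta0 = y_mean - X_mean @ beta_centered


def cost(beta):
    """Centered cost function C(beta) as a plain sum of squared residuals."""
    residual = y_tilde - X_tilde @ beta
    return residual @ residual


print("slopes, fit with intercept:", beta_full[1:])
print("slopes, centered fit:      ", beta_centered)
print("intercept, fit vs means:   ", beta_full[0], beta0)
print("centered cost at optimum:  ", cost(beta_centered))
```

Under these assumptions the slopes from the two fits coincide and the intercept follows from the means, so the centered problem can be solved without a column of ones in the design matrix.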