The Ridge case

For Ridge regression we have

\hat{\boldsymbol{\beta}}^{\mathrm{Ridge}}=\left( \boldsymbol{X}^T\boldsymbol{X}+\lambda\boldsymbol{I}\right)^{-1}\boldsymbol{X}^T\boldsymbol{y}.

Inserting the values from the example above, we obtain

\hat{\boldsymbol{\beta}}^{\mathrm{Ridge}}=\begin{bmatrix}\frac{8}{4+\lambda} \\ \frac{2}{1+\lambda}\end{bmatrix}.
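As a quick sanity check, the sketch below evaluates the matrix expression numerically and compares it with the componentwise result above. Since the design matrix is not repeated here, the sketch simply assumes \boldsymbol{X}^T\boldsymbol{X}=\mathrm{diag}(4,1) and \boldsymbol{X}^T\boldsymbol{y}=(8,2)^T , which is consistent with the result just stated.

```python
import numpy as np

# Assumed values, consistent with the closed-form result above
# (the actual X and y are defined earlier in these notes):
XtX = np.diag([4.0, 1.0])      # X^T X
Xty = np.array([8.0, 2.0])     # X^T y

for lam in [0.0, 0.1, 1.0, 10.0]:
    # Ridge estimator: (X^T X + lambda I)^{-1} X^T y
    beta_ridge = np.linalg.solve(XtX + lam * np.eye(2), Xty)
    # Componentwise expression from the text: [8/(4+lambda), 2/(1+lambda)]
    beta_text = np.array([8.0 / (4.0 + lam), 2.0 / (1.0 + lam)])
    print(lam, beta_ridge, np.allclose(beta_ridge, beta_text))
```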

There is normally a constraint on the value of \vert\vert \boldsymbol{\beta}\vert\vert_2 imposed via the parameter \lambda . Let us for simplicity assume the constraint \beta_0^2+\beta_1^2=1 . This will allow us to find expressions for the optimal values of \beta and \lambda .
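Before turning to the analytical derivation, a small numerical sketch can illustrate the idea: using the componentwise expressions above, we search for the value of \lambda at which the Ridge solution satisfies \beta_0^2+\beta_1^2=1 . The bracketing interval for the root search is an assumption chosen for this toy example.

```python
import numpy as np
from scipy.optimize import brentq

def constraint(lam):
    """Value of beta_0^2 + beta_1^2 - 1 for the Ridge solution above."""
    beta0 = 8.0 / (4.0 + lam)
    beta1 = 2.0 / (1.0 + lam)
    return beta0**2 + beta1**2 - 1.0

# At lambda = 0 the squared norm exceeds 1, and it decays towards 0 as
# lambda grows, so a single root lies inside the bracket below.
lam_opt = brentq(constraint, 0.0, 100.0)
beta_opt = np.array([8.0 / (4.0 + lam_opt), 2.0 / (1.0 + lam_opt)])
print("lambda satisfying the constraint:", lam_opt)
print("corresponding beta:", beta_opt)
```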

To see this, let us write the cost function for Ridge regression.