A new Cost Function

We could now define a new cost function to minimize, namely the negative logarithm of the above PDF

C(\boldsymbol{\beta})=-\log{\prod_{i=0}^{n-1}p(y_i,\boldsymbol{X}\vert\boldsymbol{\beta})}=-\sum_{i=0}^{n-1}\log{p(y_i,\boldsymbol{X}\vert\boldsymbol{\beta})},

which becomes

C(\boldsymbol{\beta})=\frac{n}{2}\log{2\pi\sigma^2}+\frac{\vert\vert (\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta})\vert\vert_2^2}{2\sigma^2}.
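As a minimal sketch of this cost function, the snippet below evaluates C(\boldsymbol{\beta}) for a given design matrix X, data vector y and parameter vector beta; the noise variance sigma2 is an assumed input chosen purely for illustration.

```python
import numpy as np

def neg_log_likelihood(beta, X, y, sigma2=1.0):
    """C(beta) = n/2 log(2 pi sigma^2) + ||y - X beta||_2^2 / (2 sigma^2)."""
    n = len(y)                      # number of data points
    residual = y - X @ beta         # y - X beta
    return 0.5 * n * np.log(2 * np.pi * sigma2) + residual @ residual / (2 * sigma2)
```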

Taking the derivative of the new cost function with respect to the parameters \boldsymbol{\beta} and setting it equal to zero, we recognize our familiar OLS equation, namely

\boldsymbol{X}^T\left(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}\right) =0,

which leads to the well-known OLS expression for the optimal parameters \boldsymbol{\beta}

\hat{\boldsymbol{\beta}}^{\mathrm{OLS}}=\left(\boldsymbol{X}^T\boldsymbol{X}\right)^{-1}\boldsymbol{X}^T\boldsymbol{y}!
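A small numerical check, under the assumptions of the sketch above and with toy data generated only for illustration: the closed-form estimate \left(\boldsymbol{X}^T\boldsymbol{X}\right)^{-1}\boldsymbol{X}^T\boldsymbol{y} should make the gradient \boldsymbol{X}^T(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta}) vanish and agree with NumPy's least-squares solver.

```python
import numpy as np

rng = np.random.default_rng(2024)                    # toy data for illustration only
n, p = 100, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=n)

beta_ols = np.linalg.inv(X.T @ X) @ X.T @ y          # normal-equation solution
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)   # reference least-squares solution

print(np.allclose(beta_ols, beta_lstsq))             # True
print(np.linalg.norm(X.T @ (y - X @ beta_ols)))      # ~0, the gradient condition above
```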

Before we make a similar analysis for Ridge and Lasso regression, we need a short reminder on statistics.