If our design matrix $X$, which enters the linear regression problem
$$
\boldsymbol{\beta} = \left(X^T X\right)^{-1} X^T \boldsymbol{y},
$$
has linearly dependent column vectors, we will not be able to compute the inverse of $X^T X$ and we cannot find the parameters (estimators) $\beta_i$. The estimators are only well-defined if $\left(X^T X\right)^{-1}$ exists. Linear dependence among the columns is more likely when the matrix $X$ is high-dimensional, and in that case it is likely that we encounter situations where the regression parameters $\beta_i$ cannot be estimated.
A cheap ad hoc approach is simply to add a small diagonal component to the matrix we need to invert, that is, we change
$$
X^T X \rightarrow X^T X + \lambda I,
$$
where $I$ is the identity matrix. When we discuss Ridge regression, this is actually what we end up evaluating. The parameter $\lambda$ is called a hyperparameter. More about this later.
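As a minimal numpy sketch of the problem and the fix (the data here is synthetic and chosen only for illustration): we build a design matrix with a duplicated column, so that $X^T X$ is singular, and then show that adding $\lambda I$ restores invertibility.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
x = rng.normal(size=(n, 1))
# Third column duplicates the second: the columns of X are linearly dependent
X = np.hstack([np.ones((n, 1)), x, x])
y = rng.normal(size=n)

XtX = X.T @ X
# X^T X is rank-deficient (rank 2, not 3), so its inverse does not exist
# and the ordinary least-squares estimator is not well-defined
print(np.linalg.matrix_rank(XtX))

# Adding a small diagonal term lambda * I makes the matrix invertible,
# and the modified normal equations can be solved
lam = 1e-3
beta = np.linalg.solve(XtX + lam * np.eye(X.shape[1]), X.T @ y)
print(beta)
```

Note that with the duplicated column the two corresponding parameters are not individually identifiable; the $\lambda I$ term singles out one particular solution (it shrinks the coefficients toward zero), which is exactly the Ridge regression estimate.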