Setting up the matrix to be inverted

The matrix that may cause problems for us is \( \boldsymbol{X}^T\boldsymbol{X} \). Using the SVD we can rewrite this matrix as

$$ \boldsymbol{X}^T\boldsymbol{X}=\boldsymbol{V}\boldsymbol{\Sigma}^T\boldsymbol{U}^T\boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{V}^T, $$

and using the orthogonality of the matrix \( \boldsymbol{U} \) we have

$$ \boldsymbol{X}^T\boldsymbol{X}=\boldsymbol{V}\boldsymbol{\Sigma}^T\boldsymbol{\Sigma}\boldsymbol{V}^T. $$

We define \( \boldsymbol{\Sigma}^T\boldsymbol{\Sigma}=\tilde{\boldsymbol{\Sigma}}^2 \), which is a diagonal matrix whose diagonal elements are the squared singular values. It has dimensionality \( p \times p \).
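The factorization above can be checked numerically. The following is a minimal sketch, assuming a small random design matrix with \( n=6 \) rows and \( p=3 \) columns; the sizes, the seed and the variable names are illustrative choices and are not taken from the text.

```python
import numpy as np

# Illustrative sizes (assumption): n data points, p features, with n > p
rng = np.random.default_rng(2024)
n, p = 6, 3
X = rng.normal(size=(n, p))

# Full SVD: U is n x n, s holds the p singular values, Vt is V^T (p x p)
U, s, Vt = np.linalg.svd(X, full_matrices=True)

# Build the rectangular n x p matrix Sigma with the singular values on its diagonal
Sigma = np.zeros((n, p))
Sigma[:p, :p] = np.diag(s)

# Sigma^T Sigma is the p x p diagonal matrix of squared singular values
Sigma_tilde_sq = Sigma.T @ Sigma

# Compare X^T X with V Sigma^T Sigma V^T
lhs = X.T @ X
rhs = Vt.T @ Sigma_tilde_sq @ Vt

print("Shape of Sigma^T Sigma:", Sigma_tilde_sq.shape)
print("X^T X equals V Sigma^T Sigma V^T:", np.allclose(lhs, rhs))
```

Note that `np.linalg.svd` returns \( \boldsymbol{V}^T \) rather than \( \boldsymbol{V} \), which is why the code transposes `Vt` when forming the product.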

We can now insert the result for the matrix \( \boldsymbol{X}^T\boldsymbol{X} \) into our equation for ordinary least squares where

$$ \tilde{y}_{\mathrm{OLS}}=\boldsymbol{X}\left(\boldsymbol{X}^T\boldsymbol{X}\right)^{-1}\boldsymbol{X}^T\boldsymbol{y}, $$

and using the SVD of \( \boldsymbol{X} \) we have

$$ \tilde{y}_{\mathrm{OLS}}=\boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{V}^T\left(\boldsymbol{V}\tilde{\boldsymbol{\Sigma}}^{2}\boldsymbol{V}^T\right)^{-1}\boldsymbol{V}\boldsymbol{\Sigma}^T\boldsymbol{U}^T\boldsymbol{y}, $$

which gives us, using the orthogonality of the matrix \( \boldsymbol{V} \),

$$ \tilde{y}_{\mathrm{OLS}}=\boldsymbol{U}\boldsymbol{U}^T\boldsymbol{y}=\sum_{i=0}^{p-1}\boldsymbol{u}_i\boldsymbol{u}^T_i\boldsymbol{y}. $$

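This means that the ordinary least squares model (with the optimal parameters), \( \boldsymbol{\tilde{y}} \), corresponds to an orthogonal projection of the output (or target) vector \( \boldsymbol{y} \) onto the space spanned by the first \( p \) column vectors \( \boldsymbol{u}_i \) of the matrix \( \boldsymbol{U} \). Note that the summation ends at \( p-1 \) and not at \( n-1 \), where \( n \) is the number of rows of \( \boldsymbol{X} \), so that in general \( \boldsymbol{\tilde{y}}\ne \boldsymbol{y} \) when \( n > p \). We can thus not use the orthogonality relation \( \boldsymbol{U}\boldsymbol{U}^T=\boldsymbol{I} \) for the full matrix \( \boldsymbol{U} \); the product \( \boldsymbol{U}\boldsymbol{U}^T \) in the expression above involves only the first \( p \) columns of \( \boldsymbol{U} \). This can already be seen when we multiply the matrices \( \boldsymbol{\Sigma}^T\boldsymbol{U}^T \): since \( \boldsymbol{\Sigma}^T \) has dimensionality \( p \times n \) with non-zero elements only on its diagonal, the product keeps only the first \( p \) rows of \( \boldsymbol{U}^T \).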
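The result for the OLS prediction can be illustrated with a short numerical experiment. This is only a sketch under stated assumptions: a random design matrix with full column rank, \( n=8 \) and \( p=3 \), and a random target vector; none of these values come from the text. It compares the prediction from the normal equations with the sum over the first \( p \) columns of \( \boldsymbol{U} \).

```python
import numpy as np

# Illustrative sizes and data (assumptions): n > p, random X and y
rng = np.random.default_rng(7)
n, p = 8, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

# OLS prediction from the normal equations, X (X^T X)^{-1} X^T y
y_ols = X @ np.linalg.inv(X.T @ X) @ X.T @ y

# Full SVD: U is n x n, but only its first p columns enter the prediction
U, s, Vt = np.linalg.svd(X, full_matrices=True)
U_p = U[:, :p]
y_svd = U_p @ (U_p.T @ y)

# The same quantity written as the explicit sum over u_i u_i^T y, i = 0, ..., p-1
y_sum = sum(np.outer(U[:, i], U[:, i]) @ y for i in range(p))

# With all n columns, U U^T is the identity and would simply return y
full_is_identity = np.allclose(U @ U.T, np.eye(n))

print("Normal equations agree with SVD projection:", np.allclose(y_ols, y_svd))
print("Explicit sum agrees with SVD projection:", np.allclose(y_sum, y_svd))
print("Full U U^T is the identity:", full_is_identity)
print("Prediction equals y:", np.allclose(y_ols, y))
```

The explicit matrix inverse is used here only to mirror the formula in the text; in practice one would typically prefer `np.linalg.lstsq` or `np.linalg.pinv` for numerical stability.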