Comparison with OLS

When we compare this with the ordinary least squares result, we have

$$
\hat{\boldsymbol{\beta}}_{\mathrm{OLS}} = \left(\boldsymbol{X}^T\boldsymbol{X}\right)^{-1}\boldsymbol{X}^T\boldsymbol{y},
$$

which can lead to singular matrices, since $\boldsymbol{X}^T\boldsymbol{X}$ may not be invertible. However, with the SVD we can always compute the (pseudo)inverse of the matrix $\boldsymbol{X}^T\boldsymbol{X}$, even in the singular case.
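To illustrate this point, here is a minimal NumPy sketch that contrasts the normal-equation solution with an SVD-based pseudoinverse. The design matrix `X`, the response `y`, and the tolerance used below are made-up assumptions for illustration only; they are not taken from the text.

```python
import numpy as np

# Minimal sketch: OLS via the normal equations versus an SVD-based
# pseudoinverse. X and y are made up purely for illustration.
rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))
X[:, 4] = X[:, 3]                      # duplicate a column so X^T X is singular
y = X @ np.arange(1, p + 1) + 0.1 * rng.normal(size=n)

# Normal equations: fail (or become numerically unstable) when X^T X is singular.
try:
    beta_normal = np.linalg.solve(X.T @ X, X.T @ y)
except np.linalg.LinAlgError:
    beta_normal = None                 # singular matrix, no unique solution

# SVD-based solution: always defined, because singular values below a
# tolerance are simply zeroed out (this is what np.linalg.pinv does).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
cutoff = s.max() * 1e-12               # illustrative tolerance, similar in spirit to pinv's rcond
s_inv = np.where(s > cutoff, 1.0 / s, 0.0)
beta_svd = Vt.T @ (s_inv * (U.T @ y))  # comparable to np.linalg.pinv(X) @ y

print("normal equations:", beta_normal)
print("SVD / pseudoinverse:", beta_svd)
```

Even though the normal equations break down for the deliberately rank-deficient design matrix, the SVD route still returns a well-defined solution.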

We see that Ridge regression is nothing but standard OLS with a modified diagonal term added to $\boldsymbol{X}^T\boldsymbol{X}$. The consequences, in particular for our discussion of the bias-variance tradeoff, are rather interesting. We will see that for specific values of $\lambda$ we may even reduce the variance of the optimal parameters $\boldsymbol{\beta}$. These topics, and other related ones, will be discussed after the more linear-algebra-oriented analysis here.
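To make the "modified diagonal term" concrete, here is a minimal sketch in the same NumPy style, solving the Ridge normal equations $(\boldsymbol{X}^T\boldsymbol{X}+\lambda\boldsymbol{I})\boldsymbol{\beta}=\boldsymbol{X}^T\boldsymbol{y}$ for a few values of $\lambda$. The data and the particular $\lambda$ values are assumptions chosen only for illustration.

```python
import numpy as np

# Minimal sketch: Ridge regression as OLS with lambda added to the diagonal
# of X^T X. X, y and the lambda values are made up purely for illustration.
rng = np.random.default_rng(1)
n, p = 100, 5
X = rng.normal(size=(n, p))
y = X @ np.arange(1, p + 1) + 0.1 * rng.normal(size=n)

XtX = X.T @ X
Xty = X.T @ y
for lam in [0.0, 0.1, 1.0, 10.0]:      # lam = 0 recovers plain OLS
    beta_ridge = np.linalg.solve(XtX + lam * np.eye(p), Xty)
    print(f"lambda = {lam:5.1f}  ||beta|| = {np.linalg.norm(beta_ridge):.4f}")
```

As $\lambda$ grows, the norm of the estimated parameters shrinks, which is the shrinkage effect behind the variance reduction mentioned above.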