Week 36: Linear Regression and Statistical interpretations

Assumptions made

The assumption we have made here (which will be useful when we discuss the bias-variance tradeoff) is that there exists a function $f(\boldsymbol{x})$ and a normally distributed error $\boldsymbol{\varepsilon}\sim \mathcal{N}(0, \sigma^2)$, with mean zero and variance $\sigma^2$, which together describe our data:

$\boldsymbol{y} = f(\boldsymbol{x})+\boldsymbol{\varepsilon}$
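To make this assumption concrete, the following is a minimal sketch that simulates data of the form $\boldsymbol{y} = f(\boldsymbol{x})+\boldsymbol{\varepsilon}$. The specific choice of $f$ (a quadratic polynomial), the noise level `sigma`, the number of points `n`, the seed, and the helper name `simulate_data` are all assumptions made for illustration only; they are not taken from the notes.

```python
import numpy as np


def simulate_data(n=100, sigma=0.1, seed=0):
    """Draw n samples from y = f(x) + eps with eps ~ N(0, sigma^2).

    The quadratic f used here is a hypothetical example function.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)
    # Assumed "true" function f(x); unknown in a real application.
    f = 2.0 + 3.0 * x - 1.5 * x**2
    # Normally distributed noise with mean 0 and standard deviation sigma.
    eps = rng.normal(0.0, sigma, size=n)
    y = f + eps
    return x, f, y


x, f, y = simulate_data()
# The residual y - f is exactly the simulated noise, so its sample
# standard deviation should be close to sigma.
print("sample std of noise:", np.std(y - f))
```

In practice only `x` and `y` are observed; `f` and the noise are kept here solely so the assumption can be inspected directly.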

We approximate this function with our model from the solution of the linear regression equations. That is, $f$ is approximated by $\boldsymbol{\tilde{y}}$, where we want to minimize the mean squared error (MSE), $\frac{1}{n}(\boldsymbol{y}-\boldsymbol{\tilde{y}})^T(\boldsymbol{y}-\boldsymbol{\tilde{y}})$, with $n$ the number of data points and

$\boldsymbol{\tilde{y}} = \boldsymbol{X}\boldsymbol{\beta}.$
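As a hedged illustration of this approximation, the sketch below builds a design matrix $\boldsymbol{X}$, solves the least-squares problem for $\boldsymbol{\beta}$, and evaluates the MSE of $\boldsymbol{\tilde{y}} = \boldsymbol{X}\boldsymbol{\beta}$. The linear form of the true function, the values of `n` and `sigma`, and the seed are assumptions chosen for the example. NumPy's `np.linalg.lstsq` is used as one possible solver.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Assumed setup: n noisy samples of a hypothetical linear function f.
n = 200
sigma = 0.1
x = rng.uniform(0.0, 1.0, size=n)
f_true = 1.0 + 2.0 * x
y = f_true + rng.normal(0.0, sigma, size=n)

# Design matrix with an intercept column and one feature column.
X = np.column_stack([np.ones(n), x])

# Least-squares solution for beta, i.e. the minimizer of the MSE.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Model prediction and its mean squared error on the training data.
y_tilde = X @ beta
mse = np.mean((y - y_tilde) ** 2)

print("beta:", beta)
print("MSE :", mse)
```

With normally distributed noise, the MSE of the fit is expected to be close to $\sigma^2$ (here $0.01$), and the fitted $\boldsymbol{\beta}$ close to the coefficients of the assumed $f$.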