Processing math: 100%

 

 

 

Assumptions made

The assumption we have made here can be summarized as (and this is going to be useful when we discuss the bias-variance trade off) that there exists a function f(x) and a normal distributed error εN(0,σ2) which describe our data

y=f(x)+ε

We approximate this function with our model from the solution of the linear regression equations, that is our function f is approximated by ˜y where we want to minimize (y˜y)2, our MSE, with

˜y=Xβ.