Deriving OLS from a probability distribution

When we derived the OLS equations, our basic assumption was that the output is determined by a given continuous function f(\boldsymbol{x}) plus random noise \boldsymbol{\epsilon} drawn from a normal distribution with zero mean value and an undetermined variance \sigma^2 .
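The noise model above can be illustrated with a short simulation. This is only a sketch: the particular function `f`, the noise level `sigma`, the sample size and the seed are all hypothetical choices made for illustration and do not come from the derivation itself.

```python
import numpy as np

rng = np.random.default_rng(2024)  # arbitrary seed, for reproducibility only

def f(x):
    # Hypothetical smooth function standing in for the unknown f(x)
    return 2.0 + 3.0 * x - 0.5 * x**2

n = 5000       # number of data points (assumed)
sigma = 0.5    # noise standard deviation (assumed; sigma^2 is unknown in practice)

x = np.linspace(0.0, 1.0, n)
eps = rng.normal(loc=0.0, scale=sigma, size=n)  # eps ~ N(0, sigma^2)
y = f(x) + eps                                  # y = f(x) + eps

# The noise y - f(x) should have mean close to 0 and variance close to sigma^2
residual = y - f(x)
print("sample mean of noise:    ", residual.mean())
print("sample variance of noise:", residual.var())
```

With enough points, the sample mean and variance of the noise should approach 0 and \sigma^2 respectively.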

We found above that the outputs \boldsymbol{y} have a mean value given by \boldsymbol{X}\boldsymbol{\beta} and variance \sigma^2 . Since the entries of the design matrix are not stochastic variables, the targets inherit the normal distribution of the noise, now with mean value \boldsymbol{X}\boldsymbol{\beta} instead of zero. This means that a single output y_i follows the Gaussian distribution

y_i\sim \mathcal{N}(\boldsymbol{X}_{i,*}\boldsymbol{\beta}, \sigma^2)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp{\left[-\frac{(y_i-\boldsymbol{X}_{i,*}\boldsymbol{\beta})^2}{2\sigma^2}\right]}.
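As a numerical sanity check, the density above can be evaluated for every y_i and the logarithms summed into a log-likelihood for a candidate \boldsymbol{\beta} . The sketch below assumes independent outputs, a known \sigma^2 , and a hypothetical second-order polynomial design matrix; the helper names `gaussian_pdf` and `log_likelihood` are introduced here for illustration only. Under these assumptions, the least-squares solution should give a log-likelihood at least as large as that of any other parameter vector.

```python
import numpy as np

def gaussian_pdf(y_i, mean_i, sigma2):
    # Density of N(mean_i, sigma2) evaluated at y_i (the formula above)
    return np.exp(-(y_i - mean_i) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

def log_likelihood(beta, X, y, sigma2):
    # Sum of log-densities, assuming the y_i are independent
    means = X @ beta  # row i gives X_{i,*} beta
    return np.sum(np.log(gaussian_pdf(y, means, sigma2)))

rng = np.random.default_rng(0)  # arbitrary seed
n = 200
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x, x**2])  # hypothetical design matrix
beta_true = np.array([2.0, 3.0, -0.5])      # hypothetical true parameters
sigma2 = 0.25                               # assumed known noise variance
y = X @ beta_true + rng.normal(0.0, np.sqrt(sigma2), n)

# Least-squares estimate of beta
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

ll_ols = log_likelihood(beta_ols, X, y, sigma2)
ll_true = log_likelihood(beta_true, X, y, sigma2)
ll_shifted = log_likelihood(beta_ols + 0.1, X, y, sigma2)

print("log-likelihood at OLS estimate:  ", ll_ols)
print("log-likelihood at true beta:     ", ll_true)
print("log-likelihood at shifted beta:  ", ll_shifted)
```

Because the exponent of the Gaussian is proportional to the squared residual, maximizing this log-likelihood over \boldsymbol{\beta} for fixed \sigma^2 is the same as minimizing the sum of squared residuals.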