Week 37: Statistical interpretations and Resampling Methods

Deriving OLS from a probability distribution

When we derived the OLS equations, our basic assumption was that the output is determined by a continuous function $f(\boldsymbol{x})$ plus a random noise term $\boldsymbol{\epsilon}$, drawn from a normal distribution with zero mean and an undetermined variance $\sigma^2$.

We found above that the outputs $\boldsymbol{y}$ have a mean value given by $\boldsymbol{X}\boldsymbol{\beta}$ and variance $\sigma^2$. Since the entries of the design matrix are not stochastic variables, the only randomness in the targets comes from the noise, so the targets are also normally distributed, now with mean value $\boldsymbol{X}\boldsymbol{\beta}$. This means that a single output $y_i$ follows the Gaussian distribution
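The statement about the mean and variance of the outputs can be checked numerically. The following is a minimal sketch, not part of the original derivation: the polynomial design matrix, the parameter values in `beta`, the noise level `sigma` and the number of draws are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2024)  # arbitrary seed, for reproducibility only

# Assumed setup: a fixed (non-stochastic) design matrix and known parameters.
n, sigma = 50, 0.5
x = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x, x**2])  # second-order polynomial features
beta = np.array([1.0, -2.0, 3.0])           # hypothetical parameter values

# Draw many independent noise realisations for the same X and beta.
n_draws = 20000
eps = rng.normal(0.0, sigma, size=(n_draws, n))
Y = X @ beta + eps                          # each row is one realisation of y

# Compare the empirical mean and variance of each y_i with the predictions.
mean_err = np.max(np.abs(Y.mean(axis=0) - X @ beta))
var_err = np.max(np.abs(Y.var(axis=0) - sigma**2))
print(f"max |mean(y_i) - (X beta)_i| = {mean_err:.4f}")
print(f"max |var(y_i) - sigma^2|     = {var_err:.4f}")
```

Both deviations should shrink as `n_draws` grows, since only the noise term is random while the design matrix stays fixed across draws.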

$y_i\sim \mathcal{N}(\boldsymbol{X}_{i,*}\boldsymbol{\beta}, \sigma^2)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp{\left[-\frac{(y_i-\boldsymbol{X}_{i,*}\boldsymbol{\beta})^2}{2\sigma^2}\right]}.$
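To connect this density to OLS, one can evaluate it on data and compare parameter choices. The sketch below assumes that the outputs are independent, so that the joint likelihood is the product of the single-output densities, and that $\sigma^2$ is known. The synthetic data, the helper names `gaussian_pdf` and `neg_log_likelihood`, and all numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np

def gaussian_pdf(y, mean, sigma2):
    """Density of N(mean, sigma2) evaluated at y, as in the formula above."""
    return np.exp(-(y - mean) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

def neg_log_likelihood(beta, X, y, sigma2):
    """Negative log of the product of single-output densities (independence assumed)."""
    return -np.sum(np.log(gaussian_pdf(y, X @ beta, sigma2)))

rng = np.random.default_rng(0)  # arbitrary seed

# Hypothetical synthetic data: a straight line plus Gaussian noise.
n, sigma2 = 100, 0.25
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([2.0, -1.0])
y = X @ beta_true + rng.normal(0.0, np.sqrt(sigma2), n)

# Ordinary least squares solution via a numerically stable solver.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# The OLS parameters should give a lower negative log-likelihood
# than randomly perturbed parameters.
nll_ols = neg_log_likelihood(beta_ols, X, y, sigma2)
nll_perturbed = [
    neg_log_likelihood(beta_ols + rng.normal(0.0, 0.1, size=2), X, y, sigma2)
    for _ in range(5)
]
print("NLL at OLS solution:", nll_ols)
print("NLL at perturbed parameters:", nll_perturbed)
```

Under these assumptions the negative log-likelihood is, up to a constant, the sum of squared residuals divided by $2\sigma^2$, which is why the least-squares parameters also maximize the likelihood.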