
Squared-Error Example and Iterative Fitting

To better understand what happens, let us develop the steps of the iterative fitting using the squared-error function defined above.

For simplicity we also assume that our basis functions are given by b(x;\gamma)=1+\gamma x .

This means that for every iteration m, we need to optimize

(\beta_m,\gamma_m) = \mathrm{argmin}_{\beta,\gamma}\hspace{0.1cm} \sum_{i=0}^{n-1}(y_i-f_{m-1}(x_i)-\beta b(x_i;\gamma))^2=\mathrm{argmin}_{\beta,\gamma}\hspace{0.1cm} \sum_{i=0}^{n-1}(y_i-f_{m-1}(x_i)-\beta(1+\gamma x_i))^2.
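In code, the objective at a given iteration can be written out directly. The following is a minimal Python sketch (the name boosting_cost and the array f_prev, which holds the current values f_{m-1}(x_i), are illustrative choices, not part of the text above):

import numpy as np

def boosting_cost(beta, gamma, x, y, f_prev):
    """Squared-error cost at iteration m for the basis b(x; gamma) = 1 + gamma*x.
    f_prev holds the current fit f_{m-1}(x_i) at the data points x_i."""
    residual = y - f_prev - beta * (1.0 + gamma * x)
    return np.sum(residual**2)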

We start our iteration by simply setting f_0(x)=0. Taking the derivatives of the resulting cost function {\cal C}(\beta,\gamma)=\sum_{i=0}^{n-1}(y_i-\beta(1+\gamma x_i))^2 with respect to \beta and \gamma, we obtain

\frac{\partial {\cal C}}{\partial \beta} = -2\sum_{i}(1+\gamma x_i)(y_i-\beta(1+\gamma x_i))=0,

and

\frac{\partial {\cal C}}{\partial \gamma} =-2\sum_{i}\beta x_i(y_i-\beta(1+\gamma x_i))=0.

We can then rewrite these equations (defining \boldsymbol{w}=\boldsymbol{e}+\gamma \boldsymbol{x}, with \boldsymbol{e} being the vector with all components equal to one) as

\boldsymbol{w}^T(\boldsymbol{y}-\beta \boldsymbol{w})=0,

which gives us \beta = \boldsymbol{w}^T\boldsymbol{y}/(\boldsymbol{w}^T\boldsymbol{w}). Similarly we have

\beta \boldsymbol{x}^T(\boldsymbol{y}-\beta(\boldsymbol{e}+\gamma \boldsymbol{x}))=0,

which leads to \gamma =(\boldsymbol{x}^T\boldsymbol{y}-\beta\boldsymbol{x}^T\boldsymbol{e})/(\beta\boldsymbol{x}^T\boldsymbol{x}). Inserting the expression for \beta gives us an equation for \gamma alone. This is a non-linear equation in the unknown \gamma and has to be solved numerically.
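To make the numerical step concrete, here is a minimal Python sketch that substitutes \beta(\gamma)=\boldsymbol{w}^T\boldsymbol{y}/(\boldsymbol{w}^T\boldsymbol{w}) into the second condition and finds the root in \gamma with SciPy's fsolve (the names solve_beta_gamma and beta_of, as well as the initial guess gamma0, are assumptions made for this sketch):

from scipy.optimize import fsolve

def solve_beta_gamma(x, y, gamma0=0.0):
    """Solve the two stationarity conditions for beta and gamma.
    x and y are one-dimensional NumPy arrays. For a given gamma, beta follows
    in closed form; the remaining scalar equation in gamma is non-linear and is
    solved with a numerical root finder (the initial guess gamma0 is arbitrary)."""
    def beta_of(gamma):
        w = 1.0 + gamma * x          # components of w = e + gamma*x
        return (w @ y) / (w @ w)     # beta = w^T y / (w^T w)

    def stationarity(gamma):         # second condition: x^T (y - beta * w) = 0
        w = 1.0 + gamma * x
        return x @ (y - beta_of(gamma) * w)

    gamma = fsolve(stationarity, x0=gamma0)[0]
    return beta_of(gamma), gamma

For the first iteration, where f_0(x)=0, one simply calls solve_beta_gamma(x, y) on the data; for later iterations the residuals take the place of \boldsymbol{y}, as sketched below.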

The solution of these two equations gives us \beta_1 and \gamma_1, leading to the updated estimate f_1(x) = \beta_1(1+\gamma_1 x). At the next iteration the same two equations are solved with the residuals y_i-f_1(x_i) taking the place of y_i, and doing this M times results in our final estimate for the function f.
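Putting the pieces together, a minimal sketch of the full iterative fit, reusing the solve_beta_gamma routine sketched above (the name boost_fit and the default M=10 are again only illustrative), could look as follows:

import numpy as np

def boost_fit(x, y, M=10):
    """Iterative fit with the basis b(x; gamma) = 1 + gamma*x, starting from
    f_0(x) = 0 and fitting the current residuals at every iteration."""
    betas, gammas = [], []
    f = np.zeros_like(y, dtype=float)              # f_0(x_i) = 0
    for m in range(M):
        residuals = y - f                          # y_i - f_{m-1}(x_i)
        beta_m, gamma_m = solve_beta_gamma(x, residuals)
        betas.append(beta_m)
        gammas.append(gamma_m)
        f = f + beta_m * (1.0 + gamma_m * x)       # add beta_m b(x_i; gamma_m)
    return betas, gammas, f

Here betas[0] and gammas[0] correspond to \beta_1 and \gamma_1 above, and the array f returned after M iterations holds the final estimate of f at the data points.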