Building up AdaBoost

First, for any \( \beta > 0 \), we optimize \( G \) by setting

$$ G_m = \arg\min_{G} \sum_{i=0}^{n-1} w_i^m I(y_i \ne G(x_i)), $$

which is the classifier that minimizes the weighted error rate in predicting \( y \).
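As a concrete illustration of this step (my own sketch, not part of the derivation), the weighted-error minimizer can be found by brute force over a simple hypothesis class; `fit_weighted_stump` is a hypothetical helper that searches all decision stumps:

```python
import numpy as np

def fit_weighted_stump(X, y, w):
    """Hypothetical helper: exhaustively search decision stumps for the one
    minimizing the weighted error  sum_i w_i * I(y_i != G(x_i)).
    Expects labels y in {-1, +1}; returns (predictions on X, best error)."""
    n, d = X.shape
    best_err, best_pred = np.inf, None
    for j in range(d):                        # every feature
        for thr in np.unique(X[:, j]):        # every candidate split point
            for sign in (1, -1):              # which side of the split is +1
                pred = np.where(X[:, j] <= thr, sign, -sign)
                err = np.sum(w * (pred != y))
                if err < best_err:
                    best_err, best_pred = err, pred
    return best_pred, best_err
```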

We can see this by writing out the weighted exponential loss \( \sum_{i=0}^{n-1} w_i^m \exp(-\beta\, y_i G(x_i)) \): since \( y_i G(x_i) = +1 \) when the prediction is correct and \( -1 \) when it is wrong, the loss splits into

$$ \exp(-\beta)\sum_{y_i=G(x_i)}w_i^m+\exp(\beta)\sum_{y_i\ne G(x_i)}w_i^m, $$

which can be rewritten as

$$ \left(\exp(\beta)-\exp(-\beta)\right)\sum_{i=0}^{n-1}w_i^m I(y_i\ne G(x_i))+\exp(-\beta)\sum_{i=0}^{n-1}w_i^m. $$
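The step between the two displays is the substitution

$$ \exp(-\beta)\sum_{y_i=G(x_i)} w_i^m = \exp(-\beta)\left(\sum_{i=0}^{n-1} w_i^m - \sum_{i=0}^{n-1} w_i^m I(y_i\ne G(x_i))\right), $$

after which collecting the coefficients of \( \sum_{i} w_i^m I(y_i \ne G(x_i)) \) gives the form above.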

Minimizing over \( \beta \), by differentiating and setting the derivative to zero, then leads to

$$ \beta_m = \frac{1}{2}\log{\frac{1-\overline{\mathrm{err}}_m}{\overline{\mathrm{err}}_m}}, $$

where we have redefined the error as

$$ \overline{\mathrm{err}}_m=\frac{\sum_{i=0}^{n-1}w_i^m I(y_i\ne G_m(x_i))}{\sum_{i=0}^{n-1}w_i^m}. $$
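Spelling out the minimization that produced \( \beta_m \): write \( S = \sum_{i} w_i^m \) and \( S_e = \sum_{i} w_i^m I(y_i \ne G_m(x_i)) \), so the objective is \( (\exp(\beta)-\exp(-\beta))S_e + \exp(-\beta)S \). Setting its \( \beta \)-derivative to zero gives

$$ \left(\exp(\beta)+\exp(-\beta)\right)S_e - \exp(-\beta)S = 0 \quad\Longrightarrow\quad \exp(2\beta) = \frac{S-S_e}{S_e} = \frac{1-\overline{\mathrm{err}}_m}{\overline{\mathrm{err}}_m}, $$

and taking half the logarithm recovers \( \beta_m \).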

With \( G_m \) and \( \beta_m \) in hand, the model is updated as

$$ f_m(x) = f_{m-1}(x) +\beta_m G_m(x). $$

Substituting \( f_m \) back into the exponential loss gives the weights for the next round,

$$ w_i^{m+1} = w_i^m \exp(-\beta_m y_i G_m(x_i)). $$
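Putting the pieces together, here is a minimal sketch of the full training loop under two assumptions not made in the text above: the weak learner \( G_m \) is a depth-1 scikit-learn tree fit with `sample_weight` (the brute-force stump sketched earlier would work equally well), and the error is clipped away from 0 and 1 to keep the logarithm finite.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, M):
    """Sketch of the derivation above; y must take values in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                        # uniform starting weights w_i^0
    betas, learners = [], []
    for _ in range(M):
        G = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = G.predict(X)
        err = np.sum(w * (pred != y)) / np.sum(w)  # weighted error err_m
        err = np.clip(err, 1e-12, 1 - 1e-12)       # guard the log against err in {0, 1}
        beta = 0.5 * np.log((1 - err) / err)       # beta_m
        w = w * np.exp(-beta * y * pred)           # weight update w_i^{m+1}
        betas.append(beta)
        learners.append(G)
    # f_M(x) = sum_m beta_m G_m(x); classify with its sign
    return lambda Xnew: np.sign(sum(b * G.predict(Xnew) for b, G in zip(betas, learners)))
```

Because each round multiplies the previous weights, \( w_i^{m+1} \propto \exp(-y_i f_m(x_i)) \): up to normalization, a point's weight is exactly the exponential loss the current ensemble incurs on it.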