Let us consider a binary classification problem with outcomes \( y_i \in \{-1,1\} \) for a set of observations \( i=0,1,2,\dots,n-1 \). We define a classification function \( G(x) \) which produces a prediction taking one of the two values \( \{-1,1\} \).
The error rate of the training sample is then
$$ \mathrm{\overline{err}}=\frac{1}{n} \sum_{i=0}^{n-1} I(y_i\ne G(x_i)). $$

The boosting procedure starts by defining a weak classifier, one whose error rate is only slightly better than random guessing. The idea is then to sequentially apply this weak classification algorithm to repeatedly modified versions of the data, producing a sequence of weak classifiers \( G_m(x) \).
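As a small illustration, the following Python sketch computes this training error rate as the fraction of misclassified observations. The synthetic data and the decision-stump classifier used as \( G(x) \) are purely illustrative assumptions, not part of the text.

```python
import numpy as np

def stump_predict(x, threshold=0.0):
    """Hypothetical weak classifier G(x): predict +1 above a threshold, -1 otherwise."""
    return np.where(x > threshold, 1, -1)

# Toy data: n observations with labels y_i in {-1, 1}
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = np.sign(x + 0.5 * rng.normal(size=100)).astype(int)
y[y == 0] = 1  # avoid the degenerate label 0

G_x = stump_predict(x)
err_bar = np.mean(y != G_x)  # (1/n) * sum_i I(y_i != G(x_i))
print(f"Training error rate: {err_bar:.3f}")
```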
Here we will express our function \( f(x) \) in terms of the weak classifiers \( G_m(x) \). That is, the additive expansion
$$ f_M(x) = \sum_{m=1}^M \beta_m b(x;\gamma_m), $$
where the basis functions \( b(x;\gamma_m) \) are the individual classifiers \( G_m(x) \), yields the final classification through
$$ G_M(x) = \mathrm{sign}\left(\sum_{m=1}^M \alpha_m G_m(x)\right). $$
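To make this combination concrete, here is a minimal sketch of evaluating \( G_M(x) \) as the sign of the weighted sum of weak classifiers. The stump thresholds and the weights \( \alpha_m \) are hypothetical values chosen for illustration; in an actual boosting run they would be determined by the algorithm itself.

```python
import numpy as np

def stump_predict(x, threshold):
    """Hypothetical weak classifier G_m(x): +1 above its threshold, -1 otherwise."""
    return np.where(x > threshold, 1, -1)

thresholds = [-0.5, 0.0, 0.5]   # one illustrative stump per iteration m = 1, ..., M
alphas = [0.9, 0.6, 0.4]        # illustrative weights alpha_m

x = np.linspace(-2, 2, 9)
# Weighted sum over the M weak classifiers, then take the sign: G_M(x)
weighted_sum = sum(a * stump_predict(x, t) for a, t in zip(alphas, thresholds))
G_M = np.sign(weighted_sum)
print(G_M)
```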