Let us consider a binary classification problem with two outcomes y_i \in \{-1,1\}, where i=0,1,2,\dots,n-1 indexes our set of observations. We define a classification function G(x) which produces a prediction taking one of the two values \{-1,1\}.
The error rate on the training sample is then

\mathrm{\overline{err}}=\frac{1}{n} \sum_{i=0}^{n-1} I(y_i\ne G(x_i)).

The procedure starts by defining a weak classifier, one whose error rate is only slightly better than random guessing. Boosting then sequentially applies this weak classification algorithm to repeatedly modified versions of the data, producing a sequence of weak classifiers G_m(x), m=1,2,\dots,M.
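As a concrete illustration, the training error rate above is straightforward to compute from a vector of predictions. The following is a minimal Python sketch, assuming NumPy arrays with labels and predictions taking values in \{-1,1\}; the function name err_rate is our own choice for this example.

```python
import numpy as np

def err_rate(y, y_pred):
    """Training error rate: (1/n) * sum_i I(y_i != G(x_i))."""
    return np.mean(y != y_pred)

# Example: a classifier that is only slightly better than random guessing
y      = np.array([ 1, -1,  1,  1, -1, -1,  1, -1])
y_pred = np.array([ 1, -1, -1,  1,  1, -1,  1,  1])  # three mistakes
print(err_rate(y, y_pred))  # 0.375
```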
Here we express our function f(x) in terms of the weak classifiers G_m(x). Boosting builds the additive expansion

f_M(x) = \sum_{m=1}^M \beta_m b(x;\gamma_m),

where the basis functions b(x;\gamma_m) are here the individual weak classifiers G_m(x). The final classification is taken as the sign of the weighted sum,

G_M(x) = \mathrm{sign}\left(\sum_{m=1}^M \alpha_m G_m(x)\right).
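To make this concrete, here is a minimal sketch of how such a sequence of weak classifiers G_m(x) and weights \alpha_m could be produced and combined. It assumes the standard AdaBoost.M1 weight updates of Freund and Schapire, with scikit-learn decision stumps as the weak classifiers; the function names and these specific choices are ours, not fixed by the discussion above.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_m1(X, y, M=50):
    """Fit M weak classifiers G_m(x) (decision stumps) on repeatedly
    reweighted versions of the data; y must take values in {-1, 1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                  # observation weights
    classifiers, alphas = [], []
    for m in range(M):
        G_m = DecisionTreeClassifier(max_depth=1)
        G_m.fit(X, y, sample_weight=w)
        miss = G_m.predict(X) != y
        err = np.clip(np.sum(w * miss) / np.sum(w), 1e-16, 1 - 1e-16)
        alpha = np.log((1.0 - err) / err)    # classifier weight alpha_m
        w *= np.exp(alpha * miss)            # upweight misclassified points
        classifiers.append(G_m)
        alphas.append(alpha)
    return classifiers, alphas

def predict(X, classifiers, alphas):
    """G_M(x) = sign( sum_m alpha_m G_m(x) ), ties broken toward +1."""
    s = sum(a * clf.predict(X) for a, clf in zip(alphas, classifiers))
    return np.where(s >= 0, 1, -1)
```

A call like classifiers, alphas = adaboost_m1(X_train, y_train) followed by predict(X_test, classifiers, alphas) then realizes the weighted majority vote G_M(x) above.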