The algorithm here is rather straightforward. Assume that our weak classifier is a decision tree and that we consider a binary set of outputs with $y_i \in \{-1,1\}$ and $i=0,1,2,\dots,n-1$ as our set of observations. Our design matrix is given in terms of the feature/predictor vectors $\boldsymbol{X}=[\boldsymbol{x}_0\,\boldsymbol{x}_1\,\dots\,\boldsymbol{x}_{p-1}]$. Finally, we also define a classifier determined by our data via a function $G(x)$. This function tells us how well we are able to classify our outputs/targets $y$.
We have already defined the misclassification error err as
$$
\mathrm{err} = \frac{1}{n}\sum_{i=0}^{n-1} I(y_i \neq G(x_i)),
$$
where the function $I()$ is one if we misclassify and zero if we classify correctly.
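As a minimal sketch, the misclassification error above can be computed directly with NumPy; the function name `misclassification_error` and the toy arrays below are illustrative choices, not part of the original text.

```python
import numpy as np

def misclassification_error(y, y_pred):
    """Fraction of samples where the classifier's prediction disagrees with y.

    Implements err = (1/n) * sum_i I(y_i != G(x_i)), where the indicator
    I is one for a misclassification and zero otherwise.
    """
    y = np.asarray(y)
    y_pred = np.asarray(y_pred)
    # The boolean array (y != y_pred) plays the role of the indicator I;
    # its mean is the misclassification rate.
    return np.mean(y != y_pred)

# Toy example with binary targets in {-1, 1}: two of five predictions are wrong.
y_true = np.array([1, -1, 1, 1, -1])
y_hat = np.array([1, 1, 1, -1, -1])
print(misclassification_error(y_true, y_hat))  # -> 0.4
```

In practice the predictions `y_hat` would come from the weak classifier $G$, for instance a shallow decision tree fitted to the design matrix.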