The algorithm here is rather straightforward. Assume that our weak classifier is a decision tree and that we consider a binary set of outputs with $y_i \in \{-1,1\}$ and $i=0,1,2,\dots,n-1$ as our set of observations. Our design matrix is given in terms of the feature/predictor vectors $\boldsymbol{X}=[\boldsymbol{x}_0\,\boldsymbol{x}_1\,\dots\,\boldsymbol{x}_{p-1}]$. Finally, we also define a classifier determined by our data via a function $G(x)$. This function tells us how well we are able to classify our outputs/targets $y$.
We have already defined the misclassification error err as
$$
\mathrm{err} = \frac{1}{n}\sum_{i=0}^{n-1} I(y_i \neq G(x_i)),
$$
where the function $I()$ is one if we misclassify and zero if we classify correctly.
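As a minimal sketch, the misclassification error above can be computed directly with NumPy; the function name `misclassification_error` and the toy arrays below are illustrative choices, not part of the original text.

```python
import numpy as np

def misclassification_error(y, y_pred):
    """Fraction of samples where the classifier's prediction disagrees with y.

    Implements err = (1/n) * sum_i I(y_i != G(x_i)), where the indicator
    I is one for a misclassification and zero otherwise.
    """
    y = np.asarray(y)
    y_pred = np.asarray(y_pred)
    # The boolean array (y != y_pred) plays the role of the indicator I;
    # its mean is the misclassification rate.
    return np.mean(y != y_pred)

# Toy example with binary targets in {-1, 1}: two of five predictions are wrong.
y_true = np.array([1, -1, 1, 1, -1])
y_hat = np.array([1, 1, 1, -1, -1])
print(misclassification_error(y_true, y_hat))  # -> 0.4
```

In practice the predictions `y_hat` would come from the weak classifier $G$, for instance a shallow decision tree fitted to the design matrix.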