Week 47: From Decision Trees to Ensemble Methods, Random Forests and Boosting Methods

Classification tree, how to split nodes

If the targets are the outcome of a classification process that takes one of $K$ values, labeled $k=1,2,\dots,K$, the only thing we need to decide is the splitting criterion for each node.

We define $p_{mk}$ as the proportion of observations of class $k$ in a region $R_m$ containing $N_m$ observations. For a fixed region the $p_{mk}$ sum to one over $k$, so they form a discrete probability distribution over the classes. With $I(y_i=k)$ the indicator function, equal to 1 if $y_i=k$ and 0 otherwise, this proportion is

$p_{mk} = \frac{1}{N_m}\sum_{x_i\in R_m}I(y_i=k).$
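The definition above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the labels of the observations falling in one region $R_m$ are collected in an array, classes are encoded as integers $0,\dots,K-1$ rather than $1,\dots,K$, and the function name `class_proportions` is hypothetical.

```python
import numpy as np

def class_proportions(y_region, n_classes):
    """Return p_mk for k = 0, ..., n_classes - 1, given the labels in one region R_m.

    Assumes integer class labels in the range [0, n_classes).
    """
    y_region = np.asarray(y_region)
    n_m = y_region.size                                   # N_m, observations in the region
    counts = np.bincount(y_region, minlength=n_classes)   # sum of I(y_i = k) for each k
    return counts / n_m

# Six observations in a region, three possible classes
p = class_proportions([0, 0, 1, 2, 0, 1], n_classes=3)
print(p)
```

The proportions are non-negative and sum to one, which is what allows them to be read as class probabilities within the region.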

We let $k(m) = \underset{k}{\mathrm{argmax}}\, p_{mk}$ denote the majority class of the observations in region $m$. The three most common measures used to split a node are given by

Misclassification error

$E_m = \frac{1}{N_m}\sum_{x_i\in R_m}I(y_i\ne k(m)) = 1-p_{mk(m)}.$

Gini index $g$

$g = \sum_{k=1}^K p_{mk}(1-p_{mk}).$

Information entropy or just entropy $s$

$s = -\sum_{k=1}^K p_{mk}\log{p_{mk}}.$
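To make the three measures concrete, the following sketch evaluates them for a vector of class proportions and then scores a candidate split. Several things here are assumptions rather than part of the text above: the function names are illustrative, the natural logarithm is used, terms with $p_{mk}=0$ are dropped from the entropy (the convention $0\log 0 = 0$), and the two child nodes are combined by a sample-weighted average, which is the usual CART convention.

```python
import numpy as np

def class_proportions(y_region, n_classes):
    """Proportions p_mk of each class among the integer labels of one region."""
    y_region = np.asarray(y_region)
    return np.bincount(y_region, minlength=n_classes) / y_region.size

def misclassification_error(p):
    """One minus the proportion of the majority class."""
    return 1.0 - np.max(p)

def gini(p):
    """Gini index: sum over k of p_mk * (1 - p_mk)."""
    p = np.asarray(p, dtype=float)
    return np.sum(p * (1.0 - p))

def entropy(p):
    """Entropy with natural log; zero proportions contribute nothing."""
    p = np.asarray(p, dtype=float)
    nonzero = p[p > 0]
    return -np.sum(nonzero * np.log(nonzero))

def split_impurity(y_left, y_right, n_classes, measure):
    """Sample-weighted average impurity of two child nodes (assumed CART-style)."""
    n_left, n_right = len(y_left), len(y_right)
    n_total = n_left + n_right
    imp_left = measure(class_proportions(y_left, n_classes))
    imp_right = measure(class_proportions(y_right, n_classes))
    return (n_left * imp_left + n_right * imp_right) / n_total

# A maximally mixed two-class node versus a pure node
mixed = np.array([0.5, 0.5])
pure = np.array([1.0, 0.0])
for name, f in [("error", misclassification_error), ("gini", gini), ("entropy", entropy)]:
    print(name, f(mixed), f(pure))

# Score one candidate split of eight observations with the Gini index
score = split_impurity([0, 0, 0, 1], [1, 1, 1, 1], n_classes=2, measure=gini)
print(score)
```

All three measures vanish for a pure node and are largest when the classes are evenly mixed. Among candidate splits, the one with the lowest weighted impurity would be preferred under this convention.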