Random Forest Algorithm
The algorithm described here can be applied to both classification and regression problems.
We will grow a forest of, say, B trees.
- For b=1:B
- Draw a bootstrap sample from the training data organized in our \boldsymbol{X} matrix.
- We then grow a random forest tree T_b on the bootstrapped data by repeating the following steps until the maximum node size is reached:
- We select m \le p variables at random from the p predictors/features.
- We pick the best split point among the m features, using for example the CART algorithm, and create a new node.
- We split the node into daughter nodes.
- Finally, output the ensemble of trees \{T_b\}_1^{B} and use it to make predictions, by averaging the tree outputs for a regression problem or by majority vote for a classification problem, as illustrated in the sketch below.
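
To make these steps concrete, here is a minimal Python sketch of the procedure for the classification case. It uses scikit-learn's DecisionTreeClassifier as the CART tree grower; the function names grow_forest and predict_forest and the parameter choices are illustrative assumptions for this example, not part of the algorithm statement above.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def grow_forest(X, y, n_trees=100, m_features="sqrt", seed=None):
    """Grow an ensemble {T_b}, b = 1, ..., B, of CART trees on bootstrap samples."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    forest = []
    for b in range(n_trees):
        # Draw a bootstrap sample (n_samples rows drawn with replacement) from X.
        idx = rng.integers(0, n_samples, size=n_samples)
        # Grow a CART tree on the bootstrapped data; max_features makes the tree
        # consider only m <= p randomly chosen features at every split.
        tree = DecisionTreeClassifier(max_features=m_features,
                                      random_state=int(rng.integers(2**31)))
        tree.fit(X[idx], y[idx])
        forest.append(tree)
    return forest

def predict_forest(forest, X):
    """Combine the trees by majority vote (for regression, average instead)."""
    votes = np.array([tree.predict(X) for tree in forest])   # shape (B, n_samples)
    # Majority vote over the B trees, assuming integer-encoded class labels.
    return np.array([np.bincount(col.astype(int)).argmax() for col in votes.T])
```

In practice one would normally use scikit-learn's ready-made RandomForestClassifier and RandomForestRegressor, which implement the same steps and in addition provide out-of-bag error estimates and feature importances.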