Random Forest Algorithm
The algorithm described here can be applied to both classification and regression problems.
We will grow a forest of, say, \( B \) trees.
- For \( b=1:B \):
  - Draw a bootstrap sample from the training data organized in our \( \boldsymbol{X} \) matrix.
  - Grow a random forest tree \( T_b \) on the bootstrapped data by repeating the following steps for each node until the minimum node size is reached:
    - Select \( m \le p \) variables at random from the \( p \) predictors/features.
    - Pick the best split point among the \( m \) features, using for example the CART algorithm, and create a new node.
    - Split the node into two daughter nodes.
- Output the ensemble of trees \( \{T_b\}_1^{B} \) and make predictions by averaging the tree outputs for a regression problem, or by majority vote for a classification problem.
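The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation: it uses scikit-learn's `DecisionTreeClassifier` as the base learner, where the `max_features` parameter plays the role of \( m \), the number of features tried at each split. The values of \( B \) and \( m \) below are arbitrary choices for the example, and the synthetic data set is generated purely for demonstration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (for illustration only)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

B, m = 50, 3  # number of trees B and features m tried per split (assumed values)
rng = np.random.default_rng(0)

trees = []
for b in range(B):
    # Draw a bootstrap sample (sampling with replacement) from the training data
    idx = rng.integers(0, len(X), len(X))
    # Grow a tree; max_features=m selects m features at random at each split
    tree = DecisionTreeClassifier(max_features=m, random_state=b)
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# Classification: majority vote over the ensemble {T_b}
votes = np.stack([t.predict(X) for t in trees])
y_pred = np.round(votes.mean(axis=0)).astype(int)  # majority vote for binary labels
print("training accuracy:", (y_pred == y).mean())
```

For a regression problem one would instead use `DecisionTreeRegressor` and replace the majority vote with a plain average of the \( B \) tree predictions.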