Week 46: Decision Trees, Ensemble methods and Random Forests
Contents
Plan for week 46
Decision trees, overarching aims
Basics of a tree
A typical Decision Tree with its pertinent Jargon, Classification Problem
General Features
How do we set it up?
Decision trees and Regression
Building a tree, regression
A top-down approach, recursive binary splitting
Making a tree
Pruning the tree
Cost complexity pruning
Schematic Regression Procedure
A Classification Tree
Growing a classification tree
Classification tree, how to split nodes
Visualizing the Tree, Classification
Visualizing the Tree, The Moons
Other ways of visualizing the trees
Printing out as text
Algorithms for Setting up Decision Trees
The CART algorithm for Classification
The CART algorithm for Regression
Why binary splits?
Computing a Tree using the Gini Index
The Table
Computing the various Gini Indices
Computing the various Gini Indices, Hours slept
Computing the various Gini Indices, Hours studied
A possible code using Scikit-Learn
Further example: Computing the Gini index
Simple Python Code to read in Data and perform Classification
Computing the Gini Factor
Regression trees
Final regressor code
Pros and cons of trees, pros
Disadvantages
Ensemble Methods: From a Single Tree to Many Trees and Extreme Boosting, Meet the Jungle of Methods
An Overview of Ensemble Methods
Why Voting?
Tossing coins
Standard imports first
Simple Voting Example, head or tail
Using the Voting Classifier
Voting and Bagging
Bagging
More bagging
Making your own Bootstrap: Changing the Level of the Decision Tree
Random forests
Random Forest Algorithm
Random Forests Compared with other Methods on the Cancer Data
Compare Bagging on Trees with Random Forests
Boosting, a Bird's Eye View
What is boosting? Additive Modelling/Iterative Fitting
Iterative Fitting, Regression and Squared-error Cost Function
Squared-Error Example and Iterative Fitting
Iterative Fitting, Classification and AdaBoost
Adaptive Boosting, AdaBoost
Building up AdaBoost
Adaptive boosting: AdaBoost, Basic Algorithm
Basic Steps of AdaBoost
AdaBoost Examples
Computing the various Gini Indices, Hours slept
See the handwritten notes from November 3.
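As a complement to the handwritten notes, here is a minimal sketch of how the Gini index of a candidate split on a numerical feature such as hours slept could be computed. The data values, the thresholds and the helper names `gini` and `weighted_gini` are hypothetical and chosen only for illustration; they are not the table used in the lecture. The sketch assumes the usual Gini impurity $G = 1 - \sum_k p_k^2$ for a node and weights the two child nodes by their share of the samples.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum_k p_k^2 for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(feature, labels, threshold):
    """Sample-weighted Gini impurity of the binary split feature <= threshold."""
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical toy data (NOT the lecture table): hours slept and pass (1) / fail (0)
hours_slept = [4, 5, 6, 7, 8, 9]
passed = [0, 0, 1, 0, 1, 1]

print(f"Gini index of the root node: {gini(passed):.4f}")

# Candidate thresholds are taken midway between neighbouring feature values
for threshold in (4.5, 5.5, 6.5, 7.5, 8.5):
    g = weighted_gini(hours_slept, passed, threshold)
    print(f"hours slept <= {threshold}: weighted Gini index = {g:.4f}")
```

The threshold with the smallest weighted Gini index gives the purest pair of child nodes and would be the preferred split for this feature.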