A similar example using PyTorch: a single-layer vanilla RNN classifier, trained for one step on a dummy batch with cross-entropy loss.

import torch
import torch.nn as nn
import torch.optim as optim
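
# Optional: fix the RNG seed so the dummy data below is reproducible
# (an illustrative addition, not part of the original example)
torch.manual_seed(0)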

# -----------------------
# 1. Hyperparameters
# -----------------------
input_size = 10
hidden_size = 20
num_layers = 1
num_classes = 2
sequence_length = 5
batch_size = 16
lr = 1e-3

# -----------------------
# 2. Dummy dataset
# -----------------------
X = torch.randn(batch_size, sequence_length, input_size)
y = torch.randint(0, num_classes, (batch_size,))
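
# In a real project the tensors would usually be wrapped in a DataLoader.
# A minimal, optional sketch (unused below, where we train on the batch directly):
from torch.utils.data import TensorDataset, DataLoader
loader = DataLoader(TensorDataset(X, y), batch_size=batch_size, shuffle=True)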

# -----------------------
# 3. Simple RNN model
# -----------------------
class SimpleRNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super().__init__()
        self.rnn = nn.RNN(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,
            nonlinearity="tanh"
        )
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out, h_n = self.rnn(x)  # out: [batch, seq, hidden]; h_n: [num_layers, batch, hidden]

        # Classify from the RNN output at the last time step only
        last_hidden = out[:, -1, :]  # [batch, hidden]

        logits = self.fc(last_hidden)
        return logits

model = SimpleRNN(input_size, hidden_size, num_layers, num_classes)

criterion = nn.CrossEntropyLoss()  # expects raw logits; applies log-softmax internally
optimizer = optim.Adam(model.parameters(), lr=lr)

# -----------------------
# 4. Training step
# -----------------------
model.train()
optimizer.zero_grad()

logits = model(X)
loss = criterion(logits, y)
loss.backward()
optimizer.step()

print(f"Loss: {loss.item():.4f}")