Computing a Tree using the Gini Index
Consider the following example with a set of attributes/features, each
with two possible outcomes (classes). Assume we wish to find
correlations between the average grade of a student as a function of the
number of hours studied and hours slept. We also want to correlate the
grade in a given course with the general trend, that is, whether the student
has recently received grades below or above average.
We have three features/attributes, followed by the outcome we want to classify
- Trend of average grades before present course, classified as either below or above the average grade of the whole class
- The number of hours studied, classified again as either high (more than 3 hours per day) or low. Here we have used the standard that one ECTS credit corresponds to 25-30 hours of work, for a semester which lasts 18 weeks (15 weeks of lectures and 3 weeks of exams), assuming a total of 30 ECTS per semester.
- The number of hours slept, classified again as either high (more than \( 8 \) hours of sleep) or low (less than \( 8 \) hours of sleep)
- The final grade, classified as either above or below average. This is the outcome (class) we wish to predict.
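
To make the setup concrete, the following is a minimal sketch of how the Gini index of a split on each of the three binary features could be computed. The six student records are purely hypothetical and invented for illustration, and the function names `gini` and `weighted_gini` are our own. We assume the common definition \( g = 1-\sum_k p_k^2 \), where \( p_k \) is the fraction of samples in class \( k \), and we weight the Gini index of each branch by the fraction of samples it contains.

```python
from collections import Counter


def gini(labels):
    """Gini index 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def weighted_gini(feature_values, labels):
    """Gini index after splitting on a feature: each branch is
    weighted by the fraction of samples that end up in it."""
    n = len(labels)
    total = 0.0
    for value in set(feature_values):
        branch = [y for x, y in zip(feature_values, labels) if x == value]
        total += len(branch) / n * gini(branch)
    return total


# Hypothetical toy data (not taken from the text): six students
trend = ["above", "above", "below", "below", "above", "below"]
studied = ["high", "low", "high", "low", "high", "low"]
slept = ["high", "high", "low", "high", "low", "low"]
grade = ["above", "above", "below", "below", "above", "below"]  # target

print("Gini index before any split:", gini(grade))
for name, feature in [("trend", trend), ("studied", studied), ("slept", slept)]:
    print(f"Weighted Gini index after splitting on {name}:",
          weighted_gini(feature, grade))
```

In this invented data set the trend feature separates the two grade classes perfectly, so its weighted Gini index is zero and it would be chosen as the first split. With real data the ordering of the features may of course be different.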