The tuning parameter \alpha controls a trade-off between the subtree's complexity and its fit to the training data. When \alpha = 0, the subtree T simply equals T_0, because the above equation then measures only the training error. As \alpha increases, however, there is a price to pay for a tree with many terminal nodes, so the equation tends to be minimized by a smaller subtree.
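For reference, the cost complexity criterion described above is commonly written as follows, where |T| counts the terminal nodes of the subtree T, R_m is the region corresponding to the m-th terminal node, and \hat{y}_{R_m} is the mean training response in R_m:

```latex
\sum_{m=1}^{|T|} \sum_{i:\, x_i \in R_m} \left( y_i - \hat{y}_{R_m} \right)^2 \;+\; \alpha |T|
```

The first term is the training RSS of the subtree, and the second term penalizes each additional terminal node by \alpha.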
It turns out that as we increase \alpha from zero, branches get pruned from the tree in a nested and predictable fashion, so obtaining the whole sequence of subtrees as a function of \alpha is easy. We can select a value of \alpha using a validation set or cross-validation, and then return to the full data set to obtain the subtree corresponding to that chosen \alpha.
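The procedure above can be sketched with scikit-learn, whose `cost_complexity_pruning_path` method recovers the nested sequence of subtrees as a function of \alpha (called `ccp_alpha` there). The data set here is synthetic and purely illustrative:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Illustrative regression data; any training set works here.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Compute the pruning path: the sequence of alphas at which
# branches are pruned from the fully grown tree T_0.
path = DecisionTreeRegressor(random_state=0).cost_complexity_pruning_path(X, y)
ccp_alphas = path.ccp_alphas[:-1]  # drop the alpha that prunes down to the root

# Select alpha by 5-fold cross-validation over the pruning sequence.
cv_scores = [
    cross_val_score(
        DecisionTreeRegressor(random_state=0, ccp_alpha=a), X, y, cv=5
    ).mean()
    for a in ccp_alphas
]
best_alpha = ccp_alphas[int(np.argmax(cv_scores))]

# Return to the full data set and fit the subtree for the chosen alpha.
pruned_tree = DecisionTreeRegressor(random_state=0, ccp_alpha=best_alpha).fit(X, y)
```

Because the subtrees are nested, cross-validating over the finite list of alphas returned by the pruning path is enough; there is no need to search a continuous grid.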