The CART algorithm for regression works similarly to the one for classification, except that instead of trying to split the training set in a way that minimizes, say, the Gini or entropy impurity, it now tries to split the training set in a way that minimizes our well-known mean squared error (MSE). The cost function is now
$$ C(k,t_k) = \frac{m_{\mathrm{left}}}{m}\mathrm{MSE}_{\mathrm{left}}+ \frac{m_{\mathrm{right}}}{m}\mathrm{MSE}_{\mathrm{right}}. $$Here the MSE for a specific node is defined as
$$ \mathrm{MSE}_{\mathrm{node}}=\frac{1}{m_\mathrm{node}}\sum_{i\in \mathrm{node}}(\overline{y}_{\mathrm{node}}-y_i)^2, $$with
$$ \overline{y}_{\mathrm{node}}=\frac{1}{m_\mathrm{node}}\sum_{i\in \mathrm{node}}y_i, $$the mean value of all observations in a specific node.
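To make the cost function concrete, here is a minimal NumPy sketch that evaluates $C(k,t_k)$ for a candidate split and greedily searches for the best threshold on a single feature. The function names `node_mse` and `cart_cost` and the toy data are our own choices for illustration, not part of any standard library.

```python
import numpy as np

def node_mse(y):
    """MSE of a node: mean squared deviation from the node mean."""
    return np.mean((y - np.mean(y)) ** 2) if len(y) > 0 else 0.0

def cart_cost(X, y, k, t_k):
    """Weighted cost C(k, t_k) of splitting feature k at threshold t_k."""
    left = X[:, k] <= t_k
    right = ~left
    m = len(y)
    return (left.sum() / m) * node_mse(y[left]) + (right.sum() / m) * node_mse(y[right])

# Greedy search: pick the threshold on feature 0 that minimizes the cost
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(100, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=100)
thresholds = np.unique(X[:, 0])
best_t = min(thresholds, key=lambda t: cart_cost(X, y, 0, t))
print(f"best split at x = {best_t:.3f}")
```

The full CART algorithm simply repeats this search over all features at every node and recurses on the two resulting child nodes.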
Without any regularization, decision trees for regression, just like those for classification, are prone to overfitting: the tree keeps splitting until each leaf fits the training data (and its noise) almost perfectly.
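As a sketch of how such regularization looks in practice, using scikit-learn's `DecisionTreeRegressor` (the specific hyperparameter values below are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=200)

# Unregularized tree: grows until the leaves fit the noise in the data
deep_tree = DecisionTreeRegressor().fit(X, y)
# Regularized tree: limiting depth and leaf size restricts the model's variance
shallow_tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=10).fit(X, y)

print("depth without regularization:", deep_tree.get_depth())
print("depth with regularization:   ", shallow_tree.get_depth())
```

Constraints such as `max_depth`, `min_samples_leaf`, or `min_samples_split` all act the same way: they stop the recursive splitting early, trading a small increase in bias for a large reduction in variance.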