How to set up the cross-validation for Ridge and/or Lasso
- Define a range of interest for the penalty parameter.
- Divide the data set into a training set and a test set comprising the samples \( \{1, \ldots, n\} \setminus \{ i \} \) and \( \{ i \} \), respectively.
- Fit the linear regression model, for example with Ridge or Lasso regression, for each \( \lambda \) in the grid using the training set, and compute the corresponding estimate of the error variance \( \boldsymbol{\sigma}_{-i}^2(\lambda) \). For Ridge regression the parameter estimate reads
$$
\begin{align*}
\boldsymbol{\beta}_{-i}(\lambda) & = ( \boldsymbol{X}_{-i, \ast}^{T}
\boldsymbol{X}_{-i, \ast} + \lambda \boldsymbol{I}_{pp})^{-1}
\boldsymbol{X}_{-i, \ast}^{T} \boldsymbol{y}_{-i}
\end{align*}
$$
- Evaluate the prediction performance of these models on the test set by a performance measure \( C[y_i, \boldsymbol{X}_{i, \ast}; \boldsymbol{\beta}_{-i}(\lambda), \boldsymbol{\sigma}_{-i}^2(\lambda)] \), or alternatively by the prediction error \( |y_i - \boldsymbol{X}_{i, \ast} \boldsymbol{\beta}_{-i}(\lambda)| \), the relative error, the squared error or the \( R^2 \) score function.
- Repeat the previous three steps (split, fit and evaluate) such that each sample \( i = 1, \ldots, n \) plays the role of the test set once.
- Average the prediction performances of the test sets at each grid point of the penalty parameter. This average is an estimate of the prediction performance of the model corresponding to this value of the penalty parameter on novel data.
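
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration for Ridge regression only, using the closed-form expression for \( \boldsymbol{\beta}_{-i}(\lambda) \) and the squared prediction error as performance measure. The function names `ridge_fit` and `loocv_ridge`, the synthetic data set and the grid of \( \lambda \) values are assumptions made for this example, and no intercept or scaling of the design matrix is included.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form Ridge estimate (X^T X + lam I)^{-1} X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def loocv_ridge(X, y, lambdas):
    """Mean squared leave-one-out prediction error for each lambda."""
    n = X.shape[0]
    scores = np.zeros(len(lambdas))
    for k, lam in enumerate(lambdas):
        errors = np.zeros(n)
        for i in range(n):
            # Training set: all samples except sample i; test set: sample i
            train = np.arange(n) != i
            beta = ridge_fit(X[train], y[train], lam)
            # Squared prediction error on the left-out sample
            errors[i] = (y[i] - X[i] @ beta) ** 2
        # Average over all n left-out samples
        scores[k] = errors.mean()
    return scores

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(2024)
n, p = 40, 5
X = rng.normal(size=(n, p))
beta_true = np.array([1.5, -2.0, 0.0, 0.5, 0.0])
y = X @ beta_true + 0.5 * rng.normal(size=n)

# Range of interest for the penalty parameter
lambdas = np.logspace(-4, 2, 20)
cv_scores = loocv_ridge(X, y, lambdas)
best_lambda = lambdas[np.argmin(cv_scores)]
print(f"lambda with smallest leave-one-out error: {best_lambda:.4g}")
```

For Lasso there is no closed-form solution, so the call to `ridge_fit` would have to be replaced by an iterative solver, for example the `Lasso` class of scikit-learn. The rest of the loop stays the same.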