We consider the case where the dependent variables y_i, also called the responses or the outcomes, are discrete and take values only in k=0,\dots,K-1 (i.e., K classes).
The goal is to predict the output classes from the design matrix \boldsymbol{X}\in\mathbb{R}^{n\times p}, built from n samples that each carry p features (predictors), and ultimately to identify the classes to which new, unseen samples belong.
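As a minimal sketch of this setup, the snippet below builds a synthetic design matrix and a label vector with NumPy. The sizes `n`, `p`, `K` and the random data are assumptions chosen purely for illustration; they do not come from any real data set.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, illustrative only

n, p, K = 6, 3, 4               # hypothetical sizes: samples, features, classes
X = rng.normal(size=(n, p))     # design matrix: one row per sample, one column per feature
y = rng.integers(0, K, size=n)  # class labels drawn from {0, ..., K-1}

print(X.shape)  # (6, 3)
print(y.shape)  # (6,)
```

Each row of `X` is one sample and the matching entry of `y` is its class, so a classifier is any rule that maps a row of `X` to one of the K labels.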
Let us specialize to the case of two classes only, with outputs y_i=0 and y_i=1. The outcomes could, for example, represent whether a credit card user defaults on their credit card debt. That is,
y_i = \begin{cases} 0 & \mathrm{no}\\ 1 & \mathrm{yes} \end{cases}.
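The binary encoding can be illustrated with a short sketch. The `status` array below holds made-up default statuses for five hypothetical users; only the mapping "no" to 0 and "yes" to 1 is taken from the text.

```python
import numpy as np

# Hypothetical default status for five credit card users (made-up values).
status = np.array(["no", "yes", "no", "no", "yes"])

# Binary encoding: y_i = 1 if the user defaulted ("yes"), 0 otherwise ("no").
y = (status == "yes").astype(int)

print(y)  # [0 1 0 0 1]
```

Encoding the two classes as 0 and 1 is a convention, but a convenient one: the mean of `y` is then the fraction of users who defaulted.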