Setting up the problem
To solve the above problem, we define the following Lagrangian function, to be minimized with respect to \( \boldsymbol{w} \) and \( b \),
$$
{\cal L}(\lambda,b,\boldsymbol{w})=\frac{1}{2}\boldsymbol{w}^T\boldsymbol{w}-\sum_{i=1}^n\lambda_i\left[y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b)-1\right],
$$
where \( \lambda_i \) is a so-called Lagrange multiplier subject to the condition \( \lambda_i \geq 0 \).
Taking the derivatives with respect to \( b \) and \( \boldsymbol{w} \) and setting them to zero, we obtain
$$
\frac{\partial {\cal L}}{\partial b} = -\sum_{i} \lambda_iy_i=0,
$$
and
$$
\frac{\partial {\cal L}}{\partial \boldsymbol{w}} = \boldsymbol{w}-\sum_{i} \lambda_iy_i\boldsymbol{x}_i=0.
$$
Inserting these constraints into the equation for \( {\cal L} \) we obtain
$$
{\cal L}=\sum_i\lambda_i-\frac{1}{2}\sum_{ij}^n\lambda_i\lambda_jy_iy_j\boldsymbol{x}_i^T\boldsymbol{x}_j,
$$
which is to be maximized with respect to the multipliers \( \lambda_i \), subject to the constraints \( \lambda_i\geq 0 \) and \( \sum_i\lambda_iy_i=0 \).
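The substitution step can be spelled out as follows. Expanding the sum in the Lagrangian gives
$$
{\cal L}=\frac{1}{2}\boldsymbol{w}^T\boldsymbol{w}-\boldsymbol{w}^T\sum_i\lambda_iy_i\boldsymbol{x}_i-b\sum_i\lambda_iy_i+\sum_i\lambda_i.
$$
Using \( \boldsymbol{w}=\sum_i\lambda_iy_i\boldsymbol{x}_i \), the second term equals \( -\boldsymbol{w}^T\boldsymbol{w} \), while the third term vanishes since \( \sum_i\lambda_iy_i=0 \). What remains is
$$
{\cal L}=\sum_i\lambda_i-\frac{1}{2}\boldsymbol{w}^T\boldsymbol{w}=\sum_i\lambda_i-\frac{1}{2}\sum_{ij}^n\lambda_i\lambda_jy_iy_j\boldsymbol{x}_i^T\boldsymbol{x}_j.
$$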
We must in addition satisfy the Karush-Kuhn-Tucker (KKT) condition
$$
\lambda_i\left[y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b) -1\right] = 0 \hspace{0.1cm}\forall i.
$$
- If \( \lambda_i > 0 \), then \( y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b)=1 \) and we say that \( \boldsymbol{x}_i \) is on the boundary.
- If \( y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b)> 1 \), we say that \( \boldsymbol{x}_i \) is not on the boundary, and the KKT condition then requires \( \lambda_i=0 \).
When \( \lambda_i > 0 \), the vectors \( \boldsymbol{x}_i \) are called support vectors. They are the vectors closest to the line (or hyperplane) and define the margin \( M \).
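As a concrete illustration of these conditions, the following minimal sketch uses a hypothetical three-point data set in two dimensions. The multipliers are assumed values worked out by hand for this particular data set, not the output of a solver, and all variable names are illustrative. The code reconstructs \( \boldsymbol{w} \) and \( b \) from the multipliers, picks out the support vectors, and checks that the products \( \lambda_i\left[y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b)-1\right] \) vanish.

```python
import math

# Toy data (hypothetical): two points of class +1 and one of class -1.
X = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0)]
y = [1.0, 1.0, -1.0]

# Multipliers worked out by hand for this data set (an assumption of
# this sketch, not the result of a numerical optimizer).
lam = [0.25, 0.0, 0.25]


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Constraint from dL/db = 0: sum_i lambda_i y_i should be zero.
balance = sum(l * yi for l, yi in zip(lam, y))

# Stationarity in w: w = sum_i lambda_i y_i x_i.
w = [sum(l * yi * xi[k] for l, yi, xi in zip(lam, y, X)) for k in range(2)]

# Support vectors are the points with lambda_i > 0.
support = [i for i, l in enumerate(lam) if l > 1e-12]

# A support vector satisfies y_i (w^T x_i + b) = 1. Since y_i = +/-1,
# this gives b = y_i - w^T x_i.
s = support[0]
b = y[s] - dot(w, X[s])

# Values y_i (w^T x_i + b) and the corresponding KKT products.
margins = [yi * (dot(w, xi) + b) for xi, yi in zip(X, y)]
kkt = [l * (m - 1.0) for l, m in zip(lam, margins)]

# Dual objective: sum_i lambda_i - 1/2 sum_ij lambda_i lambda_j y_i y_j x_i^T x_j.
n = len(X)
dual = sum(lam) - 0.5 * sum(
    lam[i] * lam[j] * y[i] * y[j] * dot(X[i], X[j])
    for i in range(n)
    for j in range(n)
)
# Primal objective w^T w / 2 evaluated at the reconstructed w.
primal = 0.5 * dot(w, w)

print("sum_i lambda_i y_i =", balance)
print("w =", w, "b =", b)
print("support vector indices:", support)
print("y_i (w^T x_i + b):", margins)
print("KKT products:", kkt)
print("dual =", dual, "primal =", primal)
print("distance from support vectors to hyperplane:", 1.0 / math.sqrt(dot(w, w)))
```

In this example the middle point has \( y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b)>1 \) and a vanishing multiplier, so it plays no role in \( \boldsymbol{w} \); only the two points on the boundary contribute. For realistic data sets the multipliers would be obtained from a quadratic programming solver rather than by hand.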