Setting up the problem
To solve the above problem, we define the following Lagrangian function, which is to be minimized with respect to \boldsymbol{w} and b,
{\cal L}(\lambda,b,\boldsymbol{w})=\frac{1}{2}\boldsymbol{w}^T\boldsymbol{w}-\sum_{i=1}^n\lambda_i\left[y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b)-1\right],
where \lambda_i is a so-called Lagrange multiplier subject to the condition \lambda_i \geq 0 .
Taking the derivatives with respect to b and \boldsymbol{w} and setting them equal to zero, we obtain
\frac{\partial {\cal L}}{\partial b} = -\sum_{i} \lambda_iy_i=0,
and
\frac{\partial {\cal L}}{\partial \boldsymbol{w}} = 0 = \boldsymbol{w}-\sum_{i} \lambda_iy_i\boldsymbol{x}_i.
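The two derivatives above can be checked numerically. The following is a minimal sketch, assuming NumPy and a small hypothetical data set (the arrays `X`, `y` and the randomly drawn `lam`, `w`, `b` are illustrative and not part of the text); it compares the analytical expressions with central finite differences of {\cal L}.

```python
import numpy as np

def lagrangian(lam, b, w, X, y):
    """L = 0.5 w^T w - sum_i lam_i [y_i (w^T x_i + b) - 1]."""
    margins = y * (X @ w + b)
    return 0.5 * (w @ w) - lam @ (margins - 1.0)

def grad_b(lam, y):
    # dL/db = -sum_i lam_i y_i
    return -(lam @ y)

def grad_w(lam, w, X, y):
    # dL/dw = w - sum_i lam_i y_i x_i
    return w - X.T @ (lam * y)

# Hypothetical toy data: four points in the plane with labels +1/-1
X = np.array([[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0], [-2.0, -3.0]])
y = np.array([1.0, -1.0, 1.0, -1.0])

# Arbitrary (not optimal) values at which the derivatives are compared
rng = np.random.default_rng(0)
lam = rng.uniform(0.0, 1.0, size=len(y))
w = rng.normal(size=X.shape[1])
b = 0.3
eps = 1e-6

# Central finite differences of L with respect to b and each component of w
fd_b = (lagrangian(lam, b + eps, w, X, y)
        - lagrangian(lam, b - eps, w, X, y)) / (2 * eps)
fd_w = np.array([
    (lagrangian(lam, b, w + eps * e, X, y)
     - lagrangian(lam, b, w - eps * e, X, y)) / (2 * eps)
    for e in np.eye(X.shape[1])
])

print("dL/db analytic:", grad_b(lam, y), " finite difference:", fd_b)
print("dL/dw analytic:", grad_w(lam, w, X, y), " finite difference:", fd_w)
```

Since {\cal L} is linear in b and quadratic in \boldsymbol{w}, the central differences agree with the analytical derivatives up to round-off errors.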
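Inserting these conditions into the expression for {\cal L}, that is \boldsymbol{w}=\sum_i\lambda_iy_i\boldsymbol{x}_i and \sum_i\lambda_iy_i=0 (so that the term proportional to b vanishes), we obtain
{\cal L}=\sum_i\lambda_i-\frac{1}{2}\sum_{i,j=1}^n\lambda_i\lambda_jy_iy_j\boldsymbol{x}_i^T\boldsymbol{x}_j,
which is to be maximized with respect to the \lambda_i, subject to the constraints \lambda_i\geq 0 and \sum_i\lambda_iy_i=0 .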
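As an illustration, the expression for {\cal L} in terms of the \lambda_i can be optimized numerically. The sketch below assumes NumPy and SciPy and a hypothetical, linearly separable toy data set; it maximizes {\cal L} over the \lambda_i (by minimizing -{\cal L}) with SciPy's general-purpose SLSQP solver, which is one possible choice and not necessarily the method used elsewhere in these notes.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy data: four points in the plane with labels +1/-1
X = np.array([[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0], [-2.0, -3.0]])
y = np.array([1.0, -1.0, 1.0, -1.0])
n = len(y)

# Q_ij = y_i y_j x_i^T x_j
Z = y[:, None] * X
Q = Z @ Z.T

def neg_dual(lam):
    # -(sum_i lam_i - 0.5 sum_ij lam_i lam_j y_i y_j x_i^T x_j)
    return 0.5 * lam @ Q @ lam - lam.sum()

def neg_dual_grad(lam):
    return Q @ lam - np.ones_like(lam)

result = minimize(
    neg_dual,
    x0=np.zeros(n),
    jac=neg_dual_grad,
    method="SLSQP",
    bounds=[(0.0, None)] * n,                                 # lam_i >= 0
    constraints=[{"type": "eq", "fun": lambda lam: lam @ y}],  # sum_i lam_i y_i = 0
)

lam = result.x
w = X.T @ (lam * y)  # w = sum_i lam_i y_i x_i

print("lambda:", np.round(lam, 4))
print("w:", np.round(w, 4))
```

For this small data set only the two points closest to the separating line are expected to end up with nonzero multipliers.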
We must in addition satisfy the Karush-Kuhn-Tucker (KKT) condition
\lambda_i\left[y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b) -1\right] = 0 \hspace{0.1cm}\forall i.
- If \lambda_i > 0 , then y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b)=1 and we say that \boldsymbol{x}_i is on the boundary.
- If y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b)> 1 , we say \boldsymbol{x}_i is not on the boundary and we set \lambda_i=0 .
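The two cases above can be verified on a concrete example. The sketch below assumes NumPy and a hypothetical toy data set for which the multipliers can be worked out by hand (\lambda = (1/4, 1/4, 0, 0) and b = 0); these numbers are illustrative and not taken from the text. It checks that the product \lambda_i\left[y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b)-1\right] vanishes for every point.

```python
import numpy as np

# Hypothetical toy data: four points in the plane with labels +1/-1
X = np.array([[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0], [-2.0, -3.0]])
y = np.array([1.0, -1.0, 1.0, -1.0])

# Hand-computed solution for this toy data (an assumption of the example)
lam = np.array([0.25, 0.25, 0.0, 0.0])
b = 0.0
w = X.T @ (lam * y)  # w = sum_i lam_i y_i x_i

# y_i (w^T x_i + b) for every point
margins = y * (X @ w + b)

# KKT products lam_i [y_i (w^T x_i + b) - 1], which should all be zero
kkt_products = lam * (margins - 1.0)

# Points with y_i (w^T x_i + b) = 1 lie on the boundary
on_boundary = np.isclose(margins, 1.0)

for i in range(len(y)):
    print(f"i={i}: lambda={lam[i]:.2f}, margin value={margins[i]:.2f}, "
          f"on boundary={on_boundary[i]}, KKT product={kkt_products[i]:.2e}")
```

Points with y_i(\boldsymbol{w}^T\boldsymbol{x}_i+b)>1 carry \lambda_i=0 and therefore do not contribute to \boldsymbol{w}.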
When \lambda_i > 0 , the vectors \boldsymbol{x}_i are called support vectors. They are the vectors closest to the line (or hyperplane) and define the margin M .
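Given the multipliers, the support vectors, the intercept b and the margin can be read off directly. The following sketch assumes NumPy, the same kind of hypothetical toy data with hand-computed multipliers, and the common normalization in which the margin is M = 1/||\boldsymbol{w}||; that normalization is an assumption here, since it is not restated in this section.

```python
import numpy as np

# Hypothetical toy data and hand-computed multipliers (illustrative only)
X = np.array([[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0], [-2.0, -3.0]])
y = np.array([1.0, -1.0, 1.0, -1.0])
lam = np.array([0.25, 0.25, 0.0, 0.0])

# Support vectors are the points with lam_i > 0 (small tolerance for round-off)
support = lam > 1e-8

# w = sum_i lam_i y_i x_i; only support vectors contribute
w = X.T @ (lam * y)

# On the boundary y_i (w^T x_i + b) = 1, hence b = y_i - w^T x_i since y_i^2 = 1.
# Averaging over all support vectors reduces round-off sensitivity.
b = np.mean(y[support] - X[support] @ w)

# Margin under the assumed normalization M = 1 / ||w||
M = 1.0 / np.linalg.norm(w)

# Geometric distance of every point to the hyperplane w^T x + b = 0
distances = np.abs(X @ w + b) / np.linalg.norm(w)

print("support vector indices:", np.flatnonzero(support))
print("w =", w, " b =", b, " M =", M)
print("distances to the hyperplane:", distances)
```

In this example the smallest distance to the hyperplane is attained by the support vectors and coincides with M.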