Setting up the problem
To solve the problem above, we define the following Lagrangian function, to be minimized with respect to $w$ and $b$,

$$
L(\lambda,b,w)=\frac{1}{2}w^Tw-\sum_{i=1}^n\lambda_i\left[y_i(w^Tx_i+b)-1\right],
$$

where each $\lambda_i$ is a so-called Lagrange multiplier subject to the condition $\lambda_i\geq 0$.
Taking the derivatives with respect to $b$ and $w$ we obtain

$$
\frac{\partial L}{\partial b}=-\sum_i\lambda_iy_i=0,
$$

and

$$
\frac{\partial L}{\partial w}=0=w-\sum_i\lambda_iy_ix_i.
$$
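The two derivatives can be checked numerically. The sketch below is only an illustration: the data `X`, labels `y`, multipliers `lam`, and the values of `w` and `b` are hypothetical numbers picked by hand, and the helper names `dot` and `lagrangian` are not from the text. It compares the analytic expressions with central finite differences of the Lagrangian.

```python
# Hypothetical data, multipliers and parameters, chosen only for illustration.
X = [(1.0, 2.0), (-1.5, 0.5), (2.0, -1.0)]
y = [1.0, -1.0, 1.0]
lam = [0.3, 0.7, 0.2]
w = [0.4, -0.6]
b = 0.25


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def lagrangian(lam, b, w):
    """L = 1/2 w^T w - sum_i lam_i [y_i (w^T x_i + b) - 1]."""
    penalty = sum(l * (yi * (dot(w, xi) + b) - 1.0)
                  for l, yi, xi in zip(lam, y, X))
    return 0.5 * dot(w, w) - penalty


# Analytic derivatives from the text.
dL_db = -sum(l * yi for l, yi in zip(lam, y))
dL_dw = [w[k] - sum(l * yi * xi[k] for l, yi, xi in zip(lam, y, X))
         for k in range(len(w))]

# Central finite differences with a small step h (the step size is an assumption).
h = 1e-6
fd_db = (lagrangian(lam, b + h, w) - lagrangian(lam, b - h, w)) / (2 * h)
fd_dw = []
for k in range(len(w)):
    w_plus = list(w)
    w_minus = list(w)
    w_plus[k] += h
    w_minus[k] -= h
    fd_dw.append((lagrangian(lam, b, w_plus) - lagrangian(lam, b, w_minus)) / (2 * h))

print("dL/db  analytic:", dL_db, " finite difference:", fd_db)
print("dL/dw  analytic:", dL_dw, " finite difference:", fd_dw)
```

Since $L$ is linear in $b$ and quadratic in $w$, the central differences should agree with the analytic derivatives up to rounding error.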
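Inserting these conditions into the equation for $L$, the term proportional to $b$ vanishes because $\sum_i\lambda_iy_i=0$, and $w^Tw=\sum_{ij}\lambda_i\lambda_jy_iy_jx_i^Tx_j$. We thus obtain

$$
L=\sum_i\lambda_i-\frac{1}{2}\sum_{ij}^n\lambda_i\lambda_jy_iy_jx_i^Tx_j,
$$

subject to the constraints $\lambda_i\geq 0$ and $\sum_i\lambda_iy_i=0$.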
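As a small sanity check of this expression, the sketch below evaluates it for a hypothetical two-point data set (one point per class; the data and the helper names are assumptions made for illustration). It relies on the standard fact that this dual form is maximized over the multipliers. With only two points the equality constraint forces $\lambda_1=\lambda_2$, so a one-dimensional grid scan stands in for a proper quadratic-programming solver.

```python
# Toy data (an assumption for illustration): one point per class on the x-axis.
X = [(1.0, 0.0), (-1.0, 0.0)]
y = [1.0, -1.0]


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def dual_objective(lam, X, y):
    """L = sum_i lam_i - 1/2 sum_ij lam_i lam_j y_i y_j x_i^T x_j."""
    n = len(lam)
    quad = sum(lam[i] * lam[j] * y[i] * y[j] * dot(X[i], X[j])
               for i in range(n) for j in range(n))
    return sum(lam) - 0.5 * quad


def weights_from_multipliers(lam, X, y):
    """w = sum_i lam_i y_i x_i, from the condition dL/dw = 0."""
    dim = len(X[0])
    return [sum(lam[i] * y[i] * X[i][k] for i in range(len(lam)))
            for k in range(dim)]


# sum_i lam_i y_i = 0 forces lam_1 = lam_2 = t for this data set,
# so we scan t >= 0 on a grid and keep the value with the largest dual objective.
grid = [k / 100 for k in range(0, 201)]
best_t = max(grid, key=lambda t: dual_objective([t, t], X, y))

lam = [best_t, best_t]
w = weights_from_multipliers(lam, X, y)
print("lambda =", lam)
print("dual value =", dual_objective(lam, X, y))
print("w =", w, " 1/2 w^T w =", 0.5 * dot(w, w))
```

For this toy set the dual reduces to $2t-2t^2$, so the scan selects $\lambda_1=\lambda_2=1/2$, and the dual value coincides with $\frac{1}{2}w^Tw$ for the recovered $w$. For realistic data sets a quadratic-programming solver would replace the grid scan.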
We must in addition satisfy the Karush-Kuhn-Tucker (KKT) condition

$$
\lambda_i\left[y_i(w^Tx_i+b)-1\right]=0\quad\forall i.
$$
- If $\lambda_i>0$, then $y_i(w^Tx_i+b)=1$ and we say that $x_i$ is on the boundary.
- If $y_i(w^Tx_i+b)>1$, we say $x_i$ is not on the boundary and we set $\lambda_i=0$.
When $\lambda_i>0$, the vectors $x_i$ are called support vectors. They are the vectors closest to the line (or hyperplane) and define the margin $M$.
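The conditions above can be illustrated on a small example. The following sketch uses a hypothetical four-point data set with multipliers chosen by hand (all names and numbers are assumptions, not taken from the text). It recovers $w$ from the multipliers, obtains $b$ from one support vector through $y_s(w^Tx_s+b)=1$, and then checks feasibility, the constraint $\sum_i\lambda_iy_i=0$, and the KKT condition point by point.

```python
import math

# Hypothetical toy data and multipliers, chosen by hand for illustration.
X = [(1.0, 0.0), (-1.0, 0.0), (3.0, 1.0), (-2.0, -2.0)]
y = [1.0, -1.0, 1.0, -1.0]
lam = [0.5, 0.5, 0.0, 0.0]
tol = 1e-10  # numerical tolerance (an assumption)


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


# w = sum_i lam_i y_i x_i, from dL/dw = 0.
w = [sum(l * yi * xi[k] for l, yi, xi in zip(lam, y, X)) for k in range(2)]

# Support vectors are the points with lam_i > 0.
support = [i for i, l in enumerate(lam) if l > tol]

# For a support vector, y_s (w^T x_s + b) = 1 and y_s = +1 or -1,
# hence b = y_s - w^T x_s.
s = support[0]
b = y[s] - dot(w, X[s])

# slack_i = y_i (w^T x_i + b) - 1 must be >= 0 for every point.
slack = [yi * (dot(w, xi) + b) - 1.0 for xi, yi in zip(X, y)]

feasible = all(g >= -tol for g in slack)
balanced = abs(sum(l * yi for l, yi in zip(lam, y))) <= tol
complementary = all(abs(l * g) <= tol for l, g in zip(lam, slack))

# Distance of each point to the hyperplane w^T x + b = 0.
norm_w = math.sqrt(dot(w, w))
distances = [abs(dot(w, xi) + b) / norm_w for xi in X]

print("w =", w, " b =", b)
print("support vector indices:", support)
print("slack values:", slack)
print("feasible:", feasible, " sum lam_i y_i = 0:", balanced,
      " KKT products vanish:", complementary)
print("distances to the hyperplane:", distances)
```

The points with $\lambda_i>0$ have zero slack and attain the smallest distance to the hyperplane, $1/\|w\|$. Under the common convention $M=1/\|w\|$ (an assumption here, since $M$ is defined elsewhere) this distance is the margin.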