Data Analysis and Machine Learning: Support Vector Machines

Processing math: 100%

Back to the more realistic cases

We are now ready to return to our setup of the optmization problem for a more realistic case. Introducint the slack parameter $C$ we have $\frac{1}{2} \boldsymbol{\lambda}^T\begin{bmatrix} y_1y_1K(\boldsymbol{x}_1,\boldsymbol{x}_1) & y_1y_2K(\boldsymbol{x}_1,\boldsymbol{x}_2) & \dots & \dots & y_1y_nK(\boldsymbol{x}_1,\boldsymbol{x}_n) \\ y_2y_1K(\boldsymbol{x}_2,\boldsymbol{x}_1) & y_2y_2K(\boldsymbol{x}_2,\boldsymbol{x}_2) & \dots & \dots & y_1y_nK(\boldsymbol{x}_2,\boldsymbol{x}_n) \\ \dots & \dots & \dots & \dots & \dots \\ \dots & \dots & \dots & \dots & \dots \\ y_ny_1K(\boldsymbol{x}_n,\boldsymbol{x}_1) & y_ny_2K(\boldsymbol{x}_n\boldsymbol{x}_2) & \dots & \dots & y_ny_nK(\boldsymbol{x}_n,\boldsymbol{x}_n) \\ \end{bmatrix}\boldsymbol{\lambda}-\mathbb{I}\boldsymbol{\lambda},$ subject to $\boldsymbol{y}^T\boldsymbol{\lambda}=0$ . Here we defined the vectors $\boldsymbol{\lambda} =[\lambda_1,\lambda_2,\dots,\lambda_n]$ and $\boldsymbol{y}=[y_1,y_2,\dots,y_n]$ . With the slack constants this leads to the additional constraint $0\leq \lambda_i \leq C$ .

code will be added