The problem to solve

Using our definition of the kernel We can rewrite again the Lagrangian $$ {\cal L}=\sum_i\lambda_i-\frac{1}{2}\sum_{ij}^n\lambda_i\lambda_jy_iy_j\boldsymbol{x}_i^T\boldsymbol{z}_j, $$ subject to the constraints \( \lambda_i\geq 0 \), \( \sum_i\lambda_iy_i=0 \) in terms of a convex optimization problem $$ \frac{1}{2} \boldsymbol{\lambda}^T\begin{bmatrix} y_1y_1K(\boldsymbol{x}_1,\boldsymbol{x}_1) & y_1y_2K(\boldsymbol{x}_1,\boldsymbol{x}_2) & \dots & \dots & y_1y_nK(\boldsymbol{x}_1,\boldsymbol{x}_n) \\ y_2y_1K(\boldsymbol{x}_2,\boldsymbol{x}_1) & y_2y_2(\boldsymbol{x}_2,\boldsymbol{x}_2) & \dots & \dots & y_1y_nK(\boldsymbol{x}_2,\boldsymbol{x}_n) \\ \dots & \dots & \dots & \dots & \dots \\ \dots & \dots & \dots & \dots & \dots \\ y_ny_1K(\boldsymbol{x}_n,\boldsymbol{x}_1) & y_ny_2K(\boldsymbol{x}_n\boldsymbol{x}_2) & \dots & \dots & y_ny_nK(\boldsymbol{x}_n,\boldsymbol{x}_n) \\ \end{bmatrix}\boldsymbol{\lambda}-\mathbb{1}\boldsymbol{\lambda}, $$ subject to \( \boldsymbol{y}^T\boldsymbol{\lambda}=0 \). Here we defined the vectors \( \boldsymbol{\lambda} =[\lambda_1,\lambda_2,\dots,\lambda_n] \) and \( \boldsymbol{y}=[y_1,y_2,\dots,y_n] \). If we add the slack constants this leads to the additional constraint \( 0\leq \lambda_i \leq C \).

We can rewrite this (see the solutions below) in terms of a convex optimization problem of the type $$ \begin{align*} &\mathrm{min}_{\lambda}\hspace{0.2cm} \frac{1}{2}\boldsymbol{\lambda}^T\boldsymbol{P}\boldsymbol{\lambda}+\boldsymbol{q}^T\boldsymbol{\lambda},\\ \nonumber &\mathrm{subject\hspace{0.1cm}to} \hspace{0.2cm} \boldsymbol{G}\boldsymbol{\lambda} \preceq \boldsymbol{h} \hspace{0.2cm} \wedge \boldsymbol{A}\boldsymbol{\lambda}=f. \end{align*} $$ Below we discuss how to solve these equations. Here we note that the matrix \( \boldsymbol{P} \) has matrix elements \( p_{ij}=y_iy_jK(\boldsymbol{x}_i,\boldsymbol{x}_j) \). Given a kernel \( K \) and the targets \( y_i \) this matrix is easy to set up. The constraint \( \boldsymbol{y}^T\boldsymbol{\lambda}=0 \) leads to \( f=0 \) and \( \boldsymbol{A}=\boldsymbol{y} \). How to set up the matrix \( \boldsymbol{G} \) is discussed later. Here note that the inequalities \( 0\leq \lambda_i \leq C \) can be split up into \( 0\leq \lambda_i \) and \( \lambda_i \leq C \). These two inequalities define then the matrix \( \boldsymbol{G} \) and the vector \( \boldsymbol{h} \).