We limit ourselves to two classes of outputs y_i and assign these classes the values y_i = \pm 1 . In a p -dimensional space of say p features we have a hyperplane defined as b+w_1x_1+w_2x_2+\dots +w_px_p=0.

We define a matrix \boldsymbol{X} of dimension n\times p , where n is the number of observations and p the number of features. Each observation \boldsymbol{x}_i is then a vector with p components, \boldsymbol{x}_i = \begin{bmatrix} x_{i1} \\ x_{i2} \\ \dots \\ \dots \\ x_{ip} \end{bmatrix}.

If a given observation \boldsymbol{x}_i does not lie on the hyperplane, that is if the above equation is not satisfied, we have b+w_1x_{i1}+w_2x_{i2}+\dots +w_px_{ip} >0, if our output y_i=1 . In this case we say that \boldsymbol{x}_i lies on one of the sides of the hyperplane. If instead b+w_1x_{i1}+w_2x_{i2}+\dots +w_px_{ip} < 0, for the class of observations y_i=-1 , then \boldsymbol{x}_i lies on the other side.
Equivalently, for the two classes of observations we have y_i\left(b+w_1x_{i1}+w_2x_{i2}+\dots +w_px_{ip}\right) > 0.
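As a minimal numerical sketch of this condition, the snippet below uses a small hypothetical data set (the matrix X, labels y, weights w, and intercept b are all made up for illustration) and checks that y_i\left(b+\boldsymbol{w}\cdot\boldsymbol{x}_i\right) > 0 holds for every observation when the hyperplane separates the two classes:

```python
import numpy as np

# Hypothetical toy data: n = 4 observations with p = 2 features,
# chosen so that the two classes are linearly separable.
X = np.array([[2.0, 2.0],
              [3.0, 3.0],
              [-1.0, -1.0],
              [-2.0, -3.0]])
y = np.array([1, 1, -1, -1])

# An assumed hyperplane b + w_1 x_1 + w_2 x_2 = 0 separating the data.
w = np.array([1.0, 1.0])
b = 0.0

# y_i (b + w . x_i) is positive for every i exactly when each
# observation lies on the correct side of the hyperplane.
margins = y * (b + X @ w)
print(np.all(margins > 0))  # → True for this choice of w and b
```

Note that the product y_i\left(b+\boldsymbol{w}\cdot\boldsymbol{x}_i\right) folds both cases into one inequality: it is positive whenever the sign of b+\boldsymbol{w}\cdot\boldsymbol{x}_i agrees with the label y_i.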
If such a separating hyperplane exists, we can use it to construct a natural classifier: a test observation is assigned a given class depending on which side of the hyperplane it is located.
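This classifier can be sketched in a few lines: given a hyperplane, assign the class from the sign of b+\boldsymbol{w}\cdot\boldsymbol{x}. The weights, intercept, and test points below are hypothetical choices for illustration, not a fitted model:

```python
import numpy as np

def classify(X, w, b):
    """Assign class +1 or -1 according to which side of the
    hyperplane b + w . x = 0 each row of X lies on."""
    return np.where(b + X @ w >= 0, 1, -1)

# Assumed weights and test observations.
w = np.array([1.0, -2.0])
b = 0.5
X_test = np.array([[3.0, 1.0],    # b + w . x = 1.5 > 0  -> class +1
                   [0.0, 2.0]])   # b + w . x = -3.5 < 0 -> class -1
print(classify(X_test, w, b))  # → [ 1 -1]
```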