Different kernels and Mercer's theorem

Several kernels are in popular use. Among them:

  1. Linear: $K(x, y) = x^T y$,
  2. Polynomial: $K(x, y) = (x^T y + \gamma)^d$,
  3. Gaussian Radial Basis Function (RBF): $K(x, y) = \exp(-\gamma \|x - y\|^2)$,
  4. Tanh (sigmoid): $K(x, y) = \tanh(x^T y + \gamma)$,
and many others.
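As a quick sketch, the four kernels above can be written directly in Python with NumPy. The function names and the default value `gamma=1.0` are my own choices for illustration, not anything prescribed by the text:

```python
import numpy as np

def linear(x, y):
    # K(x, y) = x^T y
    return x @ y

def polynomial(x, y, gamma=1.0, d=3):
    # K(x, y) = (x^T y + gamma)^d
    return (x @ y + gamma) ** d

def rbf(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    return np.exp(-gamma * np.linalg.norm(x - y) ** 2)

def tanh_kernel(x, y, gamma=1.0):
    # K(x, y) = tanh(x^T y + gamma)
    return np.tanh(x @ y + gamma)

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])
print(linear(x, y))   # 11.0
print(rbf(x, x))      # 1.0 (a point is maximally similar to itself)
```

Note that the RBF kernel always returns a value in $(0, 1]$, reaching $1$ only when $x = y$, which is why it is often read as a similarity measure.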

An important result for us is Mercer's theorem. It states that if a kernel function $K$ is symmetric, continuous, and yields a positive semi-definite Gram matrix for every finite set of points, then there exists a function $\phi$ that maps $x_i$ and $x_j$ into another space (possibly of much higher dimension) such that $K(x_i, x_j) = \phi(x_i)^T \phi(x_j)$. So you can use $K$ as a kernel knowing that $\phi$ exists, even if you don't know what $\phi$ is. Note that some frequently used kernels (such as the sigmoid/tanh kernel) don't satisfy all of Mercer's conditions, yet they generally work well in practice.
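Both halves of the theorem can be checked numerically in a small sketch (the specific feature map below is a standard textbook example for the homogeneous degree-2 polynomial kernel in $\mathbb{R}^2$, chosen here for illustration): first, an explicit $\phi$ with $K(x, y) = \phi(x)^T \phi(y)$; second, that the RBF kernel's Gram matrix on random points is positive semi-definite:

```python
import numpy as np

def phi(x):
    # Explicit feature map for K(x, y) = (x^T y)^2 with x in R^2:
    # phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2), mapping R^2 -> R^3
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 1.0])
# The kernel value and the inner product in feature space agree: both are 25
assert np.isclose((x @ y) ** 2, phi(x) @ phi(y))

# Mercer's PSD condition for the RBF kernel: build a Gram matrix on
# random points and confirm its eigenvalues are (numerically) nonnegative
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq_dists)
print(np.linalg.eigvalsh(K).min())  # tiny negative values are round-off only
```

For the RBF kernel the corresponding $\phi$ maps into an infinite-dimensional space, so no finite feature map can be written down, yet the PSD check above is exactly the condition Mercer's theorem asks for.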