
Tikz example – Kernel trick

The basic Support Vector Machine learning algorithm can only solve linearly separable problems. This limitation is less severe than it sounds, however: since the feature vectors enter the algorithm only through dot products k(xi,xj) = xi·xj, the "kernel trick" can be applied, replacing the dot product with another kernel (Boser et al., 1992). A more formal statement of the kernel trick is:
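To see why only kernel evaluations are needed, one can look at the standard dual form of the SVM decision function (a textbook sketch, not taken from the original post): the training points x_i never appear on their own, only inside k, so any positive definite kernel can be substituted for the plain dot product.

\[
  f(x) = \operatorname{sgn}\!\Big( \sum_{i=1}^{n} \alpha_i \, y_i \, k(x_i, x) + b \Big),
  \qquad k(x_i, x_j) = x_i \cdot x_j \ \ \text{in the linear case.}
\]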

Given an algorithm which is formulated in terms of a positive definite kernel k, one can construct an alternative algorithm by replacing k by another positive definite kernel k∗ (Schölkopf and Smola, 2002).

The best-known application of the kernel trick is the case where k is the ordinary dot product, but the trick is not limited to that case: both k and k∗ can be nonlinear kernels. More generally, given any feature map φ from the observations into an inner product space, we obtain a kernel k(xi,xj) = φ(xi)·φ(xj).
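As a concrete illustration (a standard textbook example, not specific to the figure below): in two dimensions the quadratic kernel corresponds to an explicit feature map,

\[
  \varphi(x) = \big(x_1^2,\ \sqrt{2}\,x_1 x_2,\ x_2^2\big), \qquad
  \varphi(x)\cdot\varphi(y) = x_1^2 y_1^2 + 2 x_1 x_2 y_1 y_2 + x_2^2 y_2^2 = (x \cdot y)^2,
\]

so evaluating k(x,y) = (x·y)² implicitly computes a dot product in a three-dimensional feature space without ever constructing φ(x) explicitly.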

This figure was drawn to illustrate the "kernel trick", using samples from two classes.
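The original drawing is not reproduced here; the following is a minimal TikZ sketch of the same idea, assuming an inner and an outer class that are not linearly separable in the plane, with a circular decision boundary. Coordinates and styling are illustrative only.

\documentclass[tikz,border=5pt]{standalone}
\begin{document}
\begin{tikzpicture}
  % Inner class: points clustered around the origin
  \foreach \a in {0,45,...,315}
    \fill[blue] (\a:0.8) circle (2pt);
  % Outer class: points on a surrounding ring
  \foreach \a in {0,30,...,330}
    \node[red] at (\a:2.2) {$\times$};
  % Circular decision boundary in input space; under the feature map
  % (x1^2, sqrt(2) x1 x2, x2^2) it corresponds to a separating hyperplane
  \draw[dashed,thick] (0,0) circle (1.5);
\end{tikzpicture}
\end{document}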

Tikz example – SVM trained with samples from two classes

In machine learning, Support Vector Machines are supervised learning models used for classification and regression analysis. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier. To classify examples, we choose the hyperplane so that the distance from it to the nearest data point on each side is maximized. If such a hyperplane exists, it is known as the maximum-margin hyperplane and the linear classifier it defines is known as a maximum margin classifier.
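Again the original figure is not reproduced here; the following TikZ sketch is an illustrative reconstruction (not the source of the post) showing two linearly separable classes, the maximum-margin hyperplane w·x − b = 0, and the two margin boundaries touched by the support vectors. The sample coordinates are invented for the sketch.

\documentclass[tikz,border=5pt]{standalone}
\begin{document}
\begin{tikzpicture}
  % Positive class (circles), lying on or above the upper margin
  \foreach \p in {(0.5,2.5),(1.0,3.2),(1.8,3.4),(0.8,3.8),(1.5,2.5)}
    \fill[blue] \p circle (2pt);
  % Negative class (crosses), lying on or below the lower margin
  \foreach \p in {(2.5,0.6),(3.2,1.2),(3.6,0.8),(2.8,0.3),(2.5,1.5)}
    \node[red] at \p {$\times$};
  % Maximum-margin hyperplane (positive class on the upper-left side)
  \draw[thick] (-0.2,-0.2) -- (4,4) node[right] {$w\cdot x - b = 0$};
  % Margin boundaries passing through the support vectors
  \draw[dashed] (-0.2,0.8) -- (3.2,4.2) node[right] {$w\cdot x - b = 1$};
  \draw[dashed] (0.8,-0.2) -- (4.2,3.2) node[right] {$w\cdot x - b = -1$};
\end{tikzpicture}
\end{document}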