Introduction
This document has been written in an attempt to make the Support Vector
Machine (SVM), initially conceived of by Cortes and Vapnik [1], as simple
to understand as possible for those with minimal experience of Machine
Learning. It assumes basic mathematical knowledge in areas such as calculus,
vector geometry and Lagrange multipliers. The document has been split into
Theory and Application sections so that it is obvious, after the maths has
been dealt with, how to actually apply the SVM to the different forms of
problem that each section is centred on.
The document's first section details the problem of classification for linearly
separable data and introduces the concept of the margin and the essence of SVM
- margin maximization. The second section extends the methodology to data
which is not fully linearly separable. This soft margin SVM introduces the
idea of slack variables and the trade-off between maximizing the margin and
minimizing the number of misclassified points.
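As a preview of the objective derived in that section, the soft margin trade-off is commonly written in the following form, where $C$ controls the balance between a wide margin and few margin violations, the $\xi_i$ are the slack variables and $L$ is the number of training points (the notation here is only a sketch; the precise formulation appears in the section itself):
\[
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;\; \frac{1}{2}\lVert\mathbf{w}\rVert^{2} + C\sum_{i=1}^{L}\xi_{i}
\qquad \text{subject to} \qquad y_{i}\left(\mathbf{w}\cdot\mathbf{x}_{i} + b\right) \ge 1 - \xi_{i}, \quad \xi_{i} \ge 0 \;\; \forall i.
\]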
The third section develops the concept of SVM further so that the technique
can be used for regression.
The fourth section explains the other salient feature of SVM - the Kernel
Trick. It explains how incorporating this mathematical sleight of hand
allows SVM to classify and perform regression on data that is not linearly
separable.
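For readers who have not met the term before, a kernel function returns inner products between images of data points in a (possibly much higher dimensional) feature space without the mapping $\phi$ ever being computed explicitly. A commonly used example, given here purely as an illustration rather than as the specific kernel emphasised later in this document, is the Gaussian radial basis function:
\[
k(\mathbf{x}, \mathbf{x}') = \phi(\mathbf{x}) \cdot \phi(\mathbf{x}'), \qquad
k_{\mathrm{RBF}}(\mathbf{x}, \mathbf{x}') = \exp\!\left(-\frac{\lVert \mathbf{x} - \mathbf{x}' \rVert^{2}}{2\sigma^{2}}\right).
\]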
Other than Cortes and Vapnik [1], most of this document is based on work
by Cristianini and Shawe-Taylor [2], [3], Burges [4] and Bishop [5].
For any comments on or questions about this document, please contact the
author through the URL on the title page.
Acknowledgments
The author would like to thank John Shawe-Taylor and Martin Sewell for
their assistance in checking this document.