Support Vector Machines (SVMs) have proven to be an effective approach to learning a classifier from complex datasets. However, highly nonhomogeneous data distributions can pose a challenge for SVMs when the underlying dataset comprises clusters of instances with varying mixtures of class labels. To address this challenge, we propose a novel approach, called a cluster-supported Support Vector Machine, in which information derived from clustering is incorporated directly into the SVM learning process. We provide a theoretical derivation showing that, when the total empirical loss is expressed as the combination of the quadratic empirical losses from each cluster, the optimisation problem can still be formulated as a convex quadratic programming problem. We discuss the scenarios in which this type of model would be beneficial, and present empirical evidence that demonstrates the improved accuracy of our combined model.
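The idea of a total empirical loss built from per-cluster quadratic losses can be illustrated with a small sketch. This is not the formulation from the paper, which the abstract does not spell out. It assumes a linear classifier, a squared hinge loss, and an unweighted sum over clusters; the function name `clusterwise_quadratic_loss` and the toy data are hypothetical.

```python
from collections import defaultdict

def clusterwise_quadratic_loss(w, b, X, y, cluster_ids):
    """Sum of per-cluster squared hinge losses for a linear classifier.

    Hypothetical illustration: the squared hinge loss and the plain sum
    over clusters are assumptions, not the formulation from the paper.
    """
    per_cluster = defaultdict(float)
    for x_i, y_i, c_i in zip(X, y, cluster_ids):
        # Linear decision value w . x + b
        score = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        # Hinge slack: zero when the instance is beyond the margin
        slack = max(0.0, 1.0 - y_i * score)
        # Quadratic loss, accumulated within the instance's cluster
        per_cluster[c_i] += slack ** 2
    return sum(per_cluster.values()), dict(per_cluster)

# Toy data: two clusters, labels in {-1, +1}
X = [[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0], [0.5, 0.0]]
y = [1, 1, -1, -1]
cluster_ids = [0, 0, 1, 1]

total, by_cluster = clusterwise_quadratic_loss([1.0, 0.0], 0.0, X, y, cluster_ids)
print(total, by_cluster)
```

Each squared hinge term is convex in the weights and bias, so a sum of such terms plus a quadratic regulariser remains convex. That is consistent with the abstract's claim, although the authors' actual derivation may combine the cluster losses differently.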
History
Volume
10358
Pagination
322-334
Location
New York, USA
Start date
2017-07-15
End date
2017-07-20
ISSN
0302-9743
eISSN
1611-3349
ISBN-13
9783319624150
Publication classification
E Conference publication, E1 Full written paper - refereed
Copyright notice
2017, Springer International Publishing AG
Editor/Contributor(s)
Perner P
Title of proceedings
MLDM 2017 : Proceedings of the 13th International Machine Learning and Data Mining in Pattern Recognition Conference
Event
Machine Learning and Data Mining in Pattern Recognition. Conference (13th : 2017 : New York, USA)