Internet traffic classification using machine learning : a token-based approach
conference contribution
posted on 2011-01-01, 00:00 authored by Yu Wang, Yang Xiang, S Yu
Because traditional port-based methods have become increasingly unreliable, Internet traffic classification has attracted considerable research effort in recent years. Much previous work has used statistical flow characteristics as discriminators and applied machine learning techniques to classify traffic flows. In this paper, we propose a novel machine learning based approach in which the features are extracted from the packet payload instead of flow statistics. Specifically, every flow is represented by a feature vector in which each item indicates the occurrence of a particular token, i.e., a common substring, in the payload. We applied various machine learning algorithms to evaluate the idea and used different feature selection schemes to identify the critical tokens. Experimental results on a real-world traffic data set show that the approach can achieve high accuracy with low overhead.
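The token-occurrence representation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the token vocabulary `TOKENS`, the helper `token_features`, and the two sample payloads are hypothetical, chosen only to show how a flow's payload maps to a binary feature vector.

```python
from typing import List, Sequence


def token_features(payload: bytes, tokens: Sequence[bytes]) -> List[int]:
    """Return a binary vector: 1 if the token occurs in the payload, else 0."""
    return [1 if tok in payload else 0 for tok in tokens]


# Hypothetical token vocabulary (assumption); the paper identifies its
# tokens from real traffic and then applies feature selection to them.
TOKENS = [b"HTTP/1.1", b"GET ", b"SSH-", b"220 "]

# Hypothetical payloads standing in for the first bytes of two flows.
flows = {
    "flow_a": b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n",
    "flow_b": b"SSH-2.0-OpenSSH_5.8\r\n",
}

# One feature vector per flow, indexed in the same order as TOKENS.
vectors = {name: token_features(data, TOKENS) for name, data in flows.items()}

for name, vec in vectors.items():
    print(name, vec)
```

Vectors of this form could then be passed to any standard classifier; which algorithms and feature selection schemes the paper actually used is not stated in this record.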
History
Event
International Conference on Computational Science and Engineering (14th : 2011 : Dalian, China)
Pagination
285 - 289
Publisher
IEEE
Location
Dalian, China
Place of publication
[Dalian, China]
Start date
2011-08-24
End date
2011-08-26
ISBN-13
9781457709746
ISBN-10
1457709740
Language
eng
Publication classification
E1 Full written paper - refereed
Copyright notice
2011, IEEE
Title of proceedings
CSE 2011 : Proceedings of the 14th IEEE International Conference on Computational Science and Engineering