Internet traffic classification using machine learning : a token-based approach

conference contribution
posted on 2011-01-01, 00:00 authored by Yu Wang, Yang Xiang, S Yu
Because traditional port-based methods have become increasingly unreliable, Internet traffic classification has attracted considerable research effort in recent years. Many previous papers have focused on using statistical characteristics as discriminators and applying machine learning techniques to classify traffic flows. In this paper, we propose a novel machine-learning-based approach in which the features are extracted from the packet payload instead of flow statistics. Specifically, every flow is represented by a feature vector in which each item indicates the occurrence of a particular token, i.e., a common substring, in the payload. We applied various machine learning algorithms to evaluate the idea and used different feature selection schemes to identify the critical tokens. Experimental results based on a real-world traffic data set show that the approach can achieve high accuracy with low overhead.
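The token-occurrence representation described in the abstract can be illustrated with a minimal sketch. The abstract does not say how tokens are extracted, so the sketch makes its own assumptions: a token is taken to be a fixed-length byte n-gram that appears in at least `min_flows` payloads, and the names `candidate_tokens`, `to_feature_vector`, `n`, and `min_flows`, along with the sample payloads, are hypothetical and not taken from the paper.

```python
from collections import Counter


def candidate_tokens(payloads, n=4, min_flows=2):
    """Return byte n-grams that occur in at least `min_flows` payloads.

    Assumption: fixed-length n-grams stand in for the paper's
    "common substrings"; the actual extraction method may differ.
    """
    flow_counts = Counter()
    for payload in payloads:
        # Count each n-gram once per flow, so the count is a number of flows.
        grams = {payload[i:i + n] for i in range(len(payload) - n + 1)}
        flow_counts.update(grams)
    return sorted(g for g, c in flow_counts.items() if c >= min_flows)


def to_feature_vector(payload, tokens):
    """Binary vector: 1 if the token occurs anywhere in the payload, else 0."""
    return [1 if tok in payload else 0 for tok in tokens]


# Hypothetical payloads, one per flow.
payloads = [
    b"GET /index.html HTTP/1.1",
    b"GET /logo.png HTTP/1.1",
    b"EHLO mail.example.org",
]
tokens = candidate_tokens(payloads, n=4, min_flows=2)
vectors = [to_feature_vector(p, tokens) for p in payloads]
```

Each row of `vectors` could then be passed, together with a class label, to any standard classifier, and a feature selection step could prune `tokens` down to the most discriminative ones.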

History

Event

International Conference on Computational Science and Engineering (14th : 2011 : Dalian, China)

Pagination

285 - 289

Publisher

IEEE

Location

Dalian, China

Place of publication

[Dalian, China]

Start date

2011-08-24

End date

2011-08-26

ISBN-13

9781457709746

ISBN-10

1457709740

Language

eng

Publication classification

E1 Full written paper - refereed

Copyright notice

2011, IEEE

Title of proceedings

CSE 2011 : Proceedings of the 14th IEEE International Conference on Computational Science and Engineering