Deakin University

File(s) under permanent embargo

Network traffic clustering using random forest proximities

conference contribution
posted on 2013-01-01, 00:00; authored by Yu Wang, Yang Xiang, Jun Zhang
Recent years have seen extensive work on statistics-based network traffic classification using machine learning (ML) techniques. For the particular scenario of learning from unlabeled traffic data, classic unsupervised clustering algorithms (e.g. K-Means and EM) have been applied, but the reported results are unsatisfactory because of their low accuracy. This paper presents a novel approach to the task that clusters on Random Forest (RF) proximities instead of Euclidean distances. The approach consists of two steps. In the first step, we derive a proximity measure for each pair of data points by performing an RF classification on the original data and a set of synthetic data. In the second step, we perform K-Medoids clustering on the proximity matrix to partition the data points into K groups. Evaluations on real-world Internet traffic traces indicate that the proposed approach is more accurate than the previous methods.
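The second step described in the abstract, K-Medoids over a pairwise proximity matrix, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `proximity_to_dissimilarity` and `k_medoids`, the `1 - proximity` conversion, the simple alternating (assign, then update) variant of K-Medoids, and the toy proximity matrix are all assumptions made for illustration. In practice the matrix would come from the RF step, which is not shown here.

```python
import random


def proximity_to_dissimilarity(prox):
    """Turn proximities in [0, 1] into dissimilarities via 1 - p.

    The 1 - p conversion is an assumption; the abstract does not say
    how proximities are mapped to distances.
    """
    return [[1.0 - p for p in row] for row in prox]


def assign(dissim, medoids):
    """Label each point with the index of its nearest medoid."""
    k = len(medoids)
    return [min(range(k), key=lambda c: dissim[i][medoids[c]])
            for i in range(len(dissim))]


def k_medoids(dissim, k, max_iter=100, seed=0):
    """Alternating K-Medoids on a precomputed dissimilarity matrix."""
    n = len(dissim)
    rng = random.Random(seed)
    medoids = rng.sample(range(n), k)  # random distinct starting medoids
    for _ in range(max_iter):
        labels = assign(dissim, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total
            # dissimilarity to the other members of its cluster.
            best = min(members,
                       key=lambda m: sum(dissim[m][j] for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break  # converged: medoids no longer change
        medoids = new_medoids
    return medoids, assign(dissim, medoids)


# Toy proximity matrix (hypothetical values): points 0-2 are mutually
# close, points 3-5 are mutually close, and the two groups are far apart.
prox = [
    [1.0, 0.9, 0.9, 0.1, 0.1, 0.1],
    [0.9, 1.0, 0.9, 0.1, 0.1, 0.1],
    [0.9, 0.9, 1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.8, 0.8],
    [0.1, 0.1, 0.1, 0.8, 1.0, 0.8],
    [0.1, 0.1, 0.1, 0.8, 0.8, 1.0],
]

medoids, labels = k_medoids(proximity_to_dissimilarity(prox), k=2)
print(medoids, labels)
```

Because K-Medoids only needs pairwise dissimilarities and always picks cluster centers from the data points themselves, it can operate directly on a proximity matrix, whereas K-Means needs coordinates in order to compute means.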

Event

IEEE International Conference on Communications (2013 : Budapest, Hungary)

Pagination

2058 - 2062

Publisher

IEEE

Location

Budapest, Hungary

Place of publication

Piscataway, N.J.

Start date

2013-06-09

End date

2013-06-13

ISBN-13

9781467331227

Language

eng

Publication classification

E1 Full written paper - refereed; E Conference publication

Copyright notice

2013, IEEE

Editor/Contributor(s)

D Kim, P Mueller

Title of proceedings

ICC 2013 : IEEE International Conference on Communications
