An application of novel clustering technique for information security

Beliakov, Gleb; Yearwood, J; Kelarev, Andrei

File(s) under permanent embargo

conference contribution

posted on 2011-01-01, 00:00 authored by Gleb Beliakov, J Yearwood, Andrei Kelarev

This article presents experimental results on a new application of a novel clustering technique recently introduced by the authors. Our aim is to facilitate the application of robust and stable consensus functions in information security, where it is often necessary to process large data sets and monitor outcomes in real time, as is required, for example, for intrusion detection. Here we concentrate on the particular case of profiling phishing websites. First, we apply several independent clustering algorithms to a randomized sample of the data to obtain independent initial clusterings. The silhouette index is used to determine the number of clusters. Second, we use a consensus function to combine these independent clusterings into one consensus clustering. Feature ranking is used to select a subset of features for the consensus function. Third, we train fast supervised classification algorithms on the resulting consensus clustering so that they can process the whole large data set as well as new data. The precision and recall of the classifiers at the final stage of this scheme are critical for the effectiveness of the whole procedure. We investigated various combinations of three consensus functions, Cluster-Based Graph Formulation (CBGF), Hybrid Bipartite Graph Formulation (HBGF), and Instance-Based Graph Formulation (IBGF), with a variety of supervised classification algorithms. The best precision and recall were obtained by combining the HBGF consensus function with the SMO classifier using the polynomial kernel.
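The second step above combines several base clusterings through a consensus function. As a rough illustration only, the following Python sketch builds the instance-by-cluster incidence matrix of the bipartite graph on which HBGF is usually described as operating: every instance and every base cluster is a vertex, and an edge joins an instance to each cluster that contains it. The abstract does not give the authors' construction or code, so the function name `hbgf_incidence`, the toy labels, and the data layout are all assumptions made for this example. The graph partitioning step that would turn this graph into a consensus clustering is deliberately left out.

```python
from typing import List, Tuple


def hbgf_incidence(
    clusterings: List[List[int]],
) -> Tuple[List[List[int]], List[Tuple[int, int]]]:
    """Build the incidence matrix of an instance/cluster bipartite graph.

    Hypothetical helper for illustration; not taken from the paper.
    clusterings[r][i] is the label of instance i in base clustering r.
    Returns (matrix, columns), where columns[c] = (clustering index, label)
    and matrix[i][c] == 1 exactly when instance i belongs to that cluster.
    """
    n = len(clusterings[0])
    if any(len(labels) != n for labels in clusterings):
        raise ValueError("all clusterings must label the same instances")

    # One cluster vertex per (base clustering, label) pair.
    columns = [
        (r, label)
        for r, labels in enumerate(clusterings)
        for label in sorted(set(labels))
    ]
    index = {col: c for c, col in enumerate(columns)}

    # One edge from each instance to its cluster in every base clustering.
    matrix = [[0] * len(columns) for _ in range(n)]
    for r, labels in enumerate(clusterings):
        for i, label in enumerate(labels):
            matrix[i][index[(r, label)]] = 1
    return matrix, columns


# Toy input (made up): two base clusterings of four instances.
toy_clusterings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
toy_matrix, toy_columns = hbgf_incidence(toy_clusterings)
```

Each row of `toy_matrix` contains exactly one 1 per base clustering, so instances that are grouped together by most base clusterings share most of their cluster vertices. A graph partitioner applied to this bipartite graph would then assign instances and clusters to consensus groups simultaneously.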

History

Event

Applications and Techniques in Information Security Workshop (2nd : 2011 : Melbourne, Vic.)

Pagination

6 - 11

Publisher

School of Information Systems, Deakin University

Location

Melbourne, Vic.

Place of publication

Melbourne

Start date

2011-11-09

ISBN-13

9780987229809

Language

eng

Publication classification

E1 Full written paper - refereed

Copyright notice

2011, Deakin University

Editor/Contributor(s)

M Warren

Title of proceedings

ATIS 2011 : Workshop proceeding of ATIS 2011. Melbourne, November 9th, 2011. Second Applications and Techniques in Information Security Workshop

Keywords

consensus functions, clustering, classification, phishing websites
