Deakin University
File(s) under permanent embargo

Applying clustering and ensemble clustering approaches to phishing profiling

conference contribution
posted on 2009-01-01, 00:00 authored by D Webb, John Yearwood, L Ma, P Vamplew, Bahadorreza Ofoghi, A Kelarev
This paper describes a novel approach to profiling phishing emails based on the combination of multiple independent clusterings of the email documents. Each clustering is motivated by a natural representation of the emails. A data set of 2048 phishing emails provided by a major Australian financial institution was preprocessed to extract features describing the textual content, hyperlinks and orthographic structure of the emails. Independent clusterings using different techniques were performed on each representation, and these clusterings were then ensembled using a variety of consensus functions. This paper concentrates on using several clustering approaches to determine the most likely number of phishing groups and explores how the individual and combined results relate. The approach suggests a number of phishing groups, and its structure can aid the development of profiles based on the individual clusters. The actual profiling is not carried out in this paper. © 2009, Australian Computer Society, Inc.
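The ensembling step described in the abstract can be illustrated with a minimal sketch. This record does not say which consensus functions the authors used, so the co-association (evidence accumulation) rule below is an assumed stand-in, not the paper's method. The function names, the 0.5 threshold, and the six-email toy labelings for the three views (text, hyperlinks, orthographic structure) are all hypothetical.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of base clusterings that place each pair of items together."""
    n = len(labelings[0])
    m = len(labelings)
    counts = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    return [[c / m for c in row] for row in counts]


def consensus_clusters(labelings, threshold=0.5):
    """Link pairs whose co-association exceeds the threshold, then
    return connected-component labels (assumed consensus rule)."""
    coassoc = co_association(labelings)
    n = len(coassoc)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if coassoc[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical toy labelings for six emails, one per representation
views = [
    [0, 0, 0, 1, 1, 1],  # clustering on textual content
    [1, 1, 0, 0, 0, 0],  # clustering on hyperlinks
    [0, 0, 0, 1, 1, 2],  # clustering on orthographic structure
]
consensus = consensus_clusters(views)
n_groups = len(set(consensus))
print(consensus, n_groups)
```

In this toy case the three views disagree about the third and sixth emails, yet a majority of views agrees on two groups, so the consensus labeling has two clusters. Counting the distinct consensus labels gives one simple estimate of the number of groups. Other consensus functions, or a different threshold, may give different counts.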

History

Volume

101

Pagination

25-34

Location

Melbourne, Victoria

Start date

2009-12-01

End date

2009-12-04

ISSN

1445-1336

Language

eng

Publication classification

E1.1 Full written paper - refereed, E Conference publication

Title of proceedings

AusDM 2009 : Proceedings of the Australasian Data Mining Conference

Event

Australasian Data Mining Conference (2009 : Melbourne, Victoria)

Publisher

Australian Computer Society

Place of publication

Melbourne, Vic.
