File(s) under permanent embargo
SpamCooling: a parallel heterogeneous ensemble spam filtering system based on active learning techniques
Journal contribution posted on 2010-06-01, 00:00, authored by J Wang, K Gao, Huy Quan Vu
Anti-spam technology has developed rapidly in recent years. As machine learning finds applications in diverse fields, researchers and manufacturers around the world have tried a large number of related algorithms to prevent spam. In this paper, we design an effective anti-spam protection system, SpamCooling, based on active learning and parallel heterogeneous ensemble learning techniques. The system filters spam in batches and can be easily integrated with existing mail clients (mail user agents, MUAs). It actively obtains user feedback to provide users with a personalized spam filtering experience. The parallel heterogeneous ensemble method helps the system achieve a high spam detection rate as well as a low ham misclassification rate.
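The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the SpamCooling design itself. It assumes three hand-written scoring heuristics as stand-ins for heterogeneous learned classifiers, combines them by a plain average, and treats messages whose combined score falls near the decision threshold as the ones to send to the user for feedback. All function names, the word list, the threshold, and the margin are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical word list; a real filter would learn its features from data.
SPAM_WORDS = {"free", "winner", "prize", "viagra"}

def keyword_score(text):
    """Share of known spam words, scaled and capped at 1.0."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(w.strip(".,!?") in SPAM_WORDS for w in words)
    return min(1.0, 3 * hits / len(words))

def link_score(text):
    """Two or more links give the maximum score."""
    return min(1.0, text.lower().count("http") / 2)

def shouting_score(text):
    """Fraction of letters that are upper case."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)

# The "heterogeneous" members: each scores a message in a different way.
CLASSIFIERS = [keyword_score, link_score, shouting_score]

def ensemble_score(text, pool):
    """Run all members concurrently and average their scores (an assumption)."""
    futures = [pool.submit(clf, text) for clf in CLASSIFIERS]
    scores = [f.result() for f in futures]
    return sum(scores) / len(scores)

def filter_batch(messages, threshold=0.5, margin=0.15):
    """Split a batch into spam, ham, and uncertain messages to show the user."""
    spam, ham, ask_user = [], [], []
    with ThreadPoolExecutor(max_workers=len(CLASSIFIERS)) as pool:
        for msg in messages:
            score = ensemble_score(msg, pool)
            if abs(score - threshold) < margin:
                ask_user.append(msg)  # too close to call: request feedback
            elif score >= threshold:
                spam.append(msg)
            else:
                ham.append(msg)
    return spam, ham, ask_user
```

In a fuller system, the labels collected for the `ask_user` messages would be fed back to retrain the member classifiers, which is where the personalization described in the abstract would come from. The sketch stops short of that step because the abstract does not say how feedback is incorporated.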