Deakin University

Online mining of frequent sets in data streams with error guarantee

journal contribution
Posted on 2008-08-01. Authored by X Dang, W K Ng, Kok-Leong Ong.
For most data stream applications, the volume of data is too large to be stored on permanent devices or to be scanned thoroughly more than once. It is hence recognized that approximate answers are usually sufficient: a good approximation obtained in a timely manner is often better than an exact answer that is delayed beyond the window of opportunity. Unfortunately, this is not the case for mining frequent patterns over data streams, where algorithms capable of processing data streams online do not conform strictly to a precise error guarantee. Since the quality of approximate answers is as important as their timely delivery, it is necessary to design algorithms that meet both criteria at the same time. In this paper, we propose an algorithm that allows online processing of streaming data while guaranteeing that the support error of frequent patterns stays strictly within a user-specified threshold. Our theoretical and experimental studies show that our algorithm is an effective and reliable method for finding frequent sets in data stream environments when both constraints need to be satisfied.
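To illustrate what a support-error guarantee of this kind means in practice, the sketch below uses Lossy Counting, a well-known one-pass technique for counting single items. It is not the algorithm proposed in the paper, which mines frequent sets and is not described on this page. The names `lossy_count`, `frequent_items`, `epsilon` and `min_support` are assumptions chosen for the illustration. With an error threshold `epsilon` and `n` items seen so far, each stored count underestimates the true count by at most `epsilon * n` and never overestimates it.

```python
import math


def lossy_count(stream, epsilon):
    """One-pass Lossy Counting over single items (illustrative only).

    Returns (entries, n), where entries maps item -> (count, delta).
    delta is the largest number of occurrences that may have been
    missed before the item was (re)inserted into the table.
    """
    width = math.ceil(1 / epsilon)  # number of items per bucket
    entries = {}
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in entries:
            count, delta = entries[item]
            entries[item] = (count + 1, delta)
        else:
            entries[item] = (1, bucket - 1)
        if n % width == 0:
            # At a bucket boundary, drop items that cannot be frequent.
            entries = {
                key: (count, delta)
                for key, (count, delta) in entries.items()
                if count + delta > bucket
            }
    return entries, n


def frequent_items(entries, n, min_support, epsilon):
    """Report items whose stored count reaches (min_support - epsilon) * n."""
    threshold = (min_support - epsilon) * n
    return {key for key, (count, _) in entries.items() if count >= threshold}
```

Calling `lossy_count(stream, 0.1)` and then `frequent_items(entries, n, 0.3, 0.1)` reports every item whose true support is at least 30% and no item whose true support is below 20%. The table stays small because infrequent items are pruned at every bucket boundary, which is the trade-off between memory and accuracy that the abstract refers to.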

History

Journal

Knowledge and information systems

Volume

16

Issue

2

Pagination

245 - 258

Publisher

Springer UK

Location

London, England

ISSN

0219-1377

eISSN

0219-3116

Language

eng

Notes

Published online September 22, 2007

Publication classification

C1 Refereed article in a scholarly journal

Copyright notice

2007, Springer-Verlag London Limited