Performance analysis of algorithms for frequent pattern generation

Islam, Md. Rafiqul, Chowdhury, Morshed and Khan, Safwan Mahmood 2004, Performance analysis of algorithms for frequent pattern generation, in Complex 2004: Proceedings of the 7th Asia-Pacific Complex Systems Conference, Central Queensland University, Rockhampton, Qld, pp. 43-55.

Title Performance analysis of algorithms for frequent pattern generation
Author(s) Islam, Md. Rafiqul
Chowdhury, Morshed
Khan, Safwan Mahmood
Conference name Asia-Pacific Complex Systems Conference (7th : 2004 : Cairns, Qld.)
Conference location Cairns, Australia
Conference dates 6-10 December 2004
Title of proceedings Complex 2004: Proceedings of the 7th Asia-Pacific Complex Systems Conference
Editor(s) Stonier, Russel
Han, Qinglong
Li, Wei
Publication date 2004
Start page 43
End page 55
Publisher Central Queensland University
Place of publication Rockhampton, Qld
Keyword(s) data mining
association rules
frequent pattern
DCP algorithm
PIP algorithm
PD algorithm
Summary Data mining refers to extracting, or "mining", knowledge from large amounts of data. It is also described as a method of "knowledge presentation", in which visualization and knowledge representation techniques are used to present the mined knowledge to the user. Efficient algorithms for mining frequent patterns are crucial to many data mining tasks. Since the Apriori algorithm was proposed in 1994, several methods have been proposed to improve its performance; however, most still adopt its candidate set generation-and-test approach. In addition, many methods do not generate all frequent patterns, which makes them inadequate for deriving association rules. The Pattern Decomposition (PD) algorithm significantly reduces the size of the dataset on each pass, making it more efficient to mine all frequent patterns in a large dataset. It avoids the costly process of candidate set generation and, because support is evaluated on the reduced datasets, saves a large amount of counting time. In this paper, some existing frequent pattern generation algorithms are explored and compared. The results show that the PD algorithm outperforms Direct Count of candidates & Prune transactions (DCP), an improved version of Apriori, by one order of magnitude and is faster than Predictive Item Pruning (PIP), an improved FP-tree algorithm. Further, PD is more scalable than both DCP and PIP.
ISBN 1876674962
9781876674960
Language eng
Field of Research 080109 Pattern Recognition and Data Mining
HERDC Research category E1 Full written paper - refereed
ERA Research output type E Conference publication
Persistent URL http://hdl.handle.net/10536/DRO/DU:30005388

Document type: Conference Paper
Collection: School of Information Technology
 
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Created: Mon, 07 Jul 2008, 09:49:06 EST
