Mining frequent itemsets in distorted databases with granular computing

Wang, Jinlong, Xu, Congfu and Li, Gang 2009, Mining frequent itemsets in distorted databases with granular computing, International Journal of Pattern Recognition and Artificial Intelligence, vol. 23, no. 4, pp. 825-846.


Title Mining frequent itemsets in distorted databases with granular computing
Author(s) Wang, Jinlong
Xu, Congfu
Li, Gang
Journal name International Journal of Pattern Recognition and Artificial Intelligence
Volume number 23
Issue number 4
Start page 825
End page 846
Total pages 22
Publisher World Scientific Publishing Co.
Place of publication Toh Tuck Link, Singapore
Publication date 2009-06
ISSN 0218-0014
Keyword(s) granular computing
data mining
frequent itemset
granule inference
Summary Data perturbation is a popular method for privacy-preserving data mining. However, mining a distorted database incurs much higher overhead than mining the original database. In this paper, we present the GrC-FIM algorithm to address the efficiency of mining frequent itemsets from distorted databases. Two measures are introduced to overcome weaknesses in existing work. First, the concept of an independent granule is introduced, and granule inference is used to distinguish non-independent itemsets from independent itemsets. We further prove that the support counts of non-independent itemsets can be derived directly from their subitemsets, so the error-prone reconstruction process can be avoided for them; this can improve the efficiency of the algorithm and yield more accurate results. Second, a granular-bitmap representation allows support counts to be calculated efficiently. Empirical results on representative synthetic and real-world databases indicate that GrC-FIM outperforms the popular EMASK algorithm in both efficiency and support count reconstruction accuracy.
Language eng
Field of Research 080109 Pattern Recognition and Data Mining
Socio Economic Objective 890205 Information Processing Services (incl. Data Entry and Capture)
HERDC Research category C1 Refereed article in a scholarly journal
HERDC collection year 2009
Copyright notice ©World Scientific Publishing Company
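
The summary above mentions computing support counts through a bitmap representation. The following is a minimal sketch of generic bitmap-based support counting, not the GrC-FIM algorithm itself: it works on an undistorted toy database, performs no reconstruction or granule inference, and the names `build_bitmaps` and `support_count` are hypothetical, chosen only for illustration.

```python
def build_bitmaps(transactions):
    """Map each item to an int whose bit i is set when transaction i contains the item."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support_count(itemset, bitmaps, n_transactions):
    """Count transactions containing every item: AND the item bitmaps, then count set bits."""
    acc = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        acc &= bitmaps.get(item, 0)  # an unseen item matches no transaction
    return bin(acc).count("1")


# Toy database (illustrative data, not from the paper).
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
bitmaps = build_bitmaps(transactions)
n = len(transactions)

print(support_count({"a", "c"}, bitmaps, n))
print(support_count({"a", "b"}, bitmaps, n))
```

In this representation the support of an itemset costs one bitwise AND per item plus a bit count, which is the general reason bitmap layouts are attractive for frequent itemset mining.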

Document type: Journal Article
Collection: School of Information Technology
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Citation counts: TR Web of Science: cited 2 times
Scopus: cited 2 times
Access Statistics: 576 Abstract Views, 4 File Downloads
Created: Tue, 08 Jun 2010, 14:48:34 EST by Linda Aldridge
