Complex statistical analysis of big data: implementation and application of Apriori and FP-Growth algorithm based on MapReduce

Rong, Z; Xia, D; Zhang, Zili

File(s) under permanent embargo

conference contribution

Posted on 2013-01-01, 00:00; authored by Z Rong, D Xia, Zili Zhang

In a single-machine environment, the Apriori and FP-Growth algorithms suffer from high memory consumption, low computing performance, and poor scalability and reliability when mining association rules from large-scale data. We therefore propose a new implementation based on the MapReduce parallel environment that mines frequent itemsets in order to generate association rules. The method is verified on real datasets of different sizes with different numbers of nodes in the cluster, using speedup, scalability and reliability as indicators. The results show that the method is feasible and valid, and that it improves the overall performance and efficiency of the Apriori and FP-Growth algorithms enough to meet the needs of large-scale association rule mining.
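The abstract does not describe the implementation itself, so the following is only an illustrative sketch of how one support-counting pass of frequent itemset mining can be phrased as a map step and a reduce step. It runs in a single process rather than on a cluster, and it is not the authors' method: the function names (`map_phase`, `reduce_phase`, `frequent_itemsets`), the toy transactions and the `min_support` threshold are all assumptions made for illustration. As a simplification, the mapper enumerates every k-item subset of a transaction instead of pruning candidates with the frequent (k-1)-itemsets as Apriori does.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, k):
    """Mapper (hypothetical): emit (itemset, 1) for each k-item subset of one transaction."""
    items = sorted(set(transaction))  # sort so the same itemset always yields the same key
    for itemset in combinations(items, k):
        yield itemset, 1


def reduce_phase(pairs, min_support):
    """Reducer (hypothetical): sum the counts per itemset, keep those with count >= min_support."""
    counts = defaultdict(int)
    for itemset, count in pairs:
        counts[itemset] += count
    return {itemset: c for itemset, c in counts.items() if c >= min_support}


def frequent_itemsets(transactions, k, min_support):
    """Run one map/reduce counting pass in-process (no cluster involved)."""
    pairs = (pair for t in transactions for pair in map_phase(t, k))
    return reduce_phase(pairs, min_support)


# Toy data, assumed for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "eggs"],
]

frequent_pairs = frequent_itemsets(transactions, k=2, min_support=2)
```

In a real MapReduce job the framework would group the mapper output by key and distribute it across reducers; here a single dictionary stands in for that shuffle step.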

History

Event

Software Engineering and Service Science. Conference (4th : 2013 : Beijing, China)

Series

Software Engineering and Service Science Conference

Pagination

968 - 972

Publisher

Institute of Electrical and Electronics Engineers

Location

Beijing, China

Place of publication

Piscataway, N.J.

Publisher DOI

https://doi.org/10.1109/ICSESS.2013.6615467

Start date

2013-05-23

End date

2013-05-25

ISSN

2327-0586

eISSN

2327-0594

ISBN-13

9781467349970

Language

eng

Publication classification

E1 Full written paper - refereed

Copyright notice

2013, IEEE

Editor/Contributor(s)

[Unknown]

Title of proceedings

ICSESS : Proceedings of the 2013 IEEE 4th International Conference on Software Engineering and Service Sciences

Keywords

Big data statistics; Association analysis; MapReduce; Apriori; FP-Growth
