File(s) under permanent embargo
Complex statistical analysis of big data: implementation and application of Apriori and FP-Growth algorithm based on MapReduce
On a single machine, the Apriori and FP-Growth algorithms suffer from high memory consumption, low computing performance, and poor scalability and reliability when mining association rules from large-scale data. We therefore put forward a new implementation, based on the MapReduce parallel environment, that mines frequent itemsets to generate association rules. We verify it on real datasets of different sizes with different numbers of nodes in the cluster, using speedup, scalability and reliability as indicators. The results show that the method is feasible and valid, and that it improves the overall performance and efficiency of the Apriori and FP-Growth algorithms enough to meet the needs of large-scale association rule mining.
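To illustrate the general idea of counting itemset support in a MapReduce style, here is a minimal single-process sketch. It is not the authors' implementation: the function names (`map_phase`, `reduce_phase`, `frequent_itemsets`), the toy transactions and the `min_support` value are all assumptions made for illustration. It also enumerates every k-item subset of each transaction rather than applying Apriori's candidate pruning, and it simulates the cluster by running the mappers one after another.

```python
from collections import Counter
from itertools import combinations


def map_phase(transactions, k):
    """Mapper: emit (k-itemset, 1) for every k-item subset of each transaction."""
    for transaction in transactions:
        # Sort and deduplicate so the same itemset always yields the same key.
        for itemset in combinations(sorted(set(transaction)), k):
            yield itemset, 1


def reduce_phase(pairs, min_support):
    """Reducer: sum the counts per itemset and keep those meeting min_support."""
    counts = Counter()
    for itemset, count in pairs:
        counts[itemset] += count
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def frequent_itemsets(splits, k, min_support):
    # On a real cluster each split would be handled by a different node;
    # here the splits are simply processed sequentially (an assumption).
    pairs = (pair for split in splits for pair in map_phase(split, k))
    return reduce_phase(pairs, min_support)


# Hypothetical toy data: two input splits of two transactions each.
splits = [
    [["bread", "milk"], ["bread", "butter", "milk"]],
    [["butter", "milk"], ["bread", "milk", "eggs"]],
]
result = frequent_itemsets(splits, k=2, min_support=2)
print(result)
```

Because each mapper only needs its own split and the reducer only needs per-itemset sums, the counting step can be spread across nodes, which is the property a MapReduce-based approach relies on.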
Event: Software Engineering and Service Science. Conference (4th : 2013 : Beijing, China)
Series: Software Engineering and Service Science Conference
Pagination: 968 - 972
Publisher: Institute of Electrical and Electronics Engineers
Location: Beijing, China
Place of publication: Piscataway, N.J.
Start date: 2013-05-23
End date: 2013-05-25
ISSN: 2327-0586
eISSN: 2327-0594
ISBN-13: 9781467349970
Language: eng
Publication classification: E1 Full written paper - refereed
Copyright notice: 2013, IEEE
Editor/Contributor(s): [Unknown]
Title of proceedings: ICSESS : Proceedings of the 2013 IEEE 4th International Conference on Software Engineering and Service Sciences