Openly accessible

A MapReduce-based nearest neighbor approach for big-data-driven traffic flow prediction

Xia, Dawen, Li, Huaqing, Wang, Binfeng, Li, Yantao and Zhang, Zili 2016, A MapReduce-based nearest neighbor approach for big-data-driven traffic flow prediction, IEEE Access, vol. 4, pp. 2920-2934, doi: 10.1109/ACCESS.2016.2570021.

Attached Files
Name Description MIMEType Size Downloads
zhang-amapreducebased-2016.pdf Published version application/pdf 21.00MB 9

Title A MapReduce-based nearest neighbor approach for big-data-driven traffic flow prediction
Author(s) Xia, Dawen
Li, Huaqing
Wang, Binfeng
Li, Yantao
Zhang, Zili (ORCID iD: orcid.org/0000-0002-8721-9333)
Journal name IEEE Access
Volume number 4
Start page 2920
End page 2934
Total pages 15
Publisher IEEE
Place of publication Piscataway, N. J.
Publication date 2016-01-01
ISSN 2169-3536
Keyword(s) Big data analytics
traffic flow prediction
correlation analysis
parallel classifier
Hadoop MapReduce
Summary In big-data-driven traffic flow prediction systems, the robustness of prediction performance depends on accuracy and timeliness. This paper presents a new MapReduce-based nearest neighbor (NN) approach for traffic flow prediction using correlation analysis (TFPC) on a Hadoop platform. In particular, we develop a real-time prediction system including two key modules, i.e., offline distributed training (ODT) and online parallel prediction (OPP). Moreover, we build a parallel k-nearest neighbor optimization classifier, which incorporates correlation information among traffic flows into the classification process. Finally, we propose a novel prediction calculation method, combining the current data observed in OPP and the classification results obtained from large-scale historical data in ODT, to generate traffic flow predictions in real time. The empirical study on real-world traffic flow big data using the leave-one-out cross validation method shows that TFPC significantly outperforms four state-of-the-art prediction approaches, i.e., autoregressive integrated moving average, Naïve Bayes, multilayer perceptron neural networks, and NN regression, in terms of accuracy, which can be improved by 90.07% in the best case, with an average mean absolute percent error of 5.53%. In addition, it displays excellent speedup, scaleup, and sizeup.
Language eng
DOI 10.1109/ACCESS.2016.2570021
Field of Research 080109 Pattern Recognition and Data Mining
Socio Economic Objective 970108 Expanding Knowledge in the Information and Computing Sciences
HERDC Research category C1 Refereed article in a scholarly journal
ERA Research output type C Journal article
Copyright notice ©2016, IEEE
Free to Read? Yes
Persistent URL http://hdl.handle.net/10536/DRO/DU:30085586

Document type: Journal Article
Collections: School of Information Technology
Open Access Collection
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact drosupport@deakin.edu.au.

Citation counts: TR Web of Science: cited 0 times
Scopus: cited 0 times
Access Statistics: 57 Abstract Views, 12 File Downloads
Created: Wed, 12 Oct 2016, 15:00:51 EST
