Deakin University
An efficient MapReduce-based parallel clustering algorithm for distributed traffic subarea division

journal contribution
posted on 2015-01-01, 00:00 authored by D Xia, B Wang, Y Li, Z Rong, Zili Zhang
Traffic subarea division is vital for traffic system management and traffic network analysis in intelligent transportation systems (ITSs). Since existing methods may not be suitable for big traffic data processing, this paper presents a MapReduce-based Parallel Three-Phase K-Means (Par3PKM) algorithm for solving the traffic subarea division problem on the widely adopted Hadoop distributed computing platform. Specifically, we first modify the distance metric and initialization strategy of K-Means and then employ the MapReduce paradigm to redesign the optimized K-Means algorithm for parallel clustering of large-scale taxi trajectories. Moreover, we propose a boundary identification method to connect the borders of the clustering results for each cluster. Finally, we divide Beijing into traffic subareas using the proposed approach, based on real-world trajectory data sets generated by 12,000 taxis over a period of one month. Experimental evaluation results indicate that, compared with K-Means, Par2PK-Means, and ParCLARA, Par3PKM achieves higher efficiency, higher accuracy, and better scalability, and can effectively divide traffic subareas with big taxi trajectory data.
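
To make the MapReduce formulation of K-Means more concrete, the following is a minimal single-process sketch of one iteration, written in Python rather than against the Hadoop API. It is not the Par3PKM implementation. It assumes plain Euclidean distance on 2-D points and a map/combine/reduce split, whereas the paper uses a modified distance metric and initialization strategy that this record does not specify. All function names (map_phase, combine_phase, reduce_phase, kmeans_iteration) are hypothetical.

```python
from collections import defaultdict
from math import hypot


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every point in a split."""
    for x, y in points:
        # Assumption: Euclidean distance; the paper's modified metric is not given here.
        idx = min(
            range(len(centroids)),
            key=lambda i: hypot(x - centroids[i][0], y - centroids[i][1]),
        )
        yield idx, (x, y)


def combine_phase(pairs):
    """Combine: pre-aggregate one split into per-cluster [sum_x, sum_y, count]."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for idx, (x, y) in pairs:
        acc = sums[idx]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return sums


def reduce_phase(partials, centroids):
    """Reduce: merge the partial sums of all splits and recompute centroids."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for partial in partials:
        for idx, (sx, sy, n) in partial.items():
            acc = totals[idx]
            acc[0] += sx
            acc[1] += sy
            acc[2] += n
    # A cluster that received no points keeps its previous centroid.
    new_centroids = list(centroids)
    for idx, (sx, sy, n) in totals.items():
        new_centroids[idx] = (sx / n, sy / n)
    return new_centroids


def kmeans_iteration(splits, centroids):
    """Run one K-Means iteration over data splits (each split stands in for one mapper)."""
    partials = [combine_phase(map_phase(split, centroids)) for split in splits]
    return reduce_phase(partials, centroids)
```

In a real Hadoop job each split would be processed by a separate mapper, and the iteration would repeat until the centroids stop moving. The combine step is included because it shrinks the data shuffled to the reducers to one partial sum per cluster per split.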

History

Journal

Discrete dynamics in nature and society

Volume

2015

Article number

793010

Pagination

1 - 18

Publisher

Hindawi Publishing Corp.

Location

Cairo, Egypt

ISSN

1026-0226

eISSN

1607-887X

Language

eng

Publication classification

C Journal article; C1 Refereed article in a scholarly journal

Copyright notice

2015, The Authors
