Deakin University

File(s) under permanent embargo

Fast and scalable big data trajectory clustering for understanding urban mobility

journal contribution
posted on 2018-11-01, 00:00 authored by D Kumar, H Wu, Sutharshan Rajasegarar, C Leckie, S Krishnaswamy, M Palaniswami
Clustering large-scale vehicle trajectories is important for understanding urban traffic patterns, particularly for optimizing public transport routes and frequencies and for improving the decisions made by authorities. Existing trajectory clustering schemes are not well suited to large numbers of trajectories in dense city road networks, because it is difficult to find a representative distance measure between trajectories that scales to very large datasets. In this paper, we propose a novel Dijkstra-based dynamic time warping distance measure between two trajectories, trajDTW, which is suitable for the large numbers of overlapping trajectories found in the dense road networks of major cities around the world. We also propose a novel fast-clusiVAT algorithm that can suggest the number of clusters in a trajectory dataset and identify and visualize the trajectories belonging to each cluster. We conduct experiments on a large-scale taxi trajectory dataset consisting of 3.28 million trajectories obtained from the GPS traces of 15,061 taxis within Singapore over a period of one month. Our analysis finds 13 trajectory clusters spanning the major expressways of Singapore, each of which can be further divided into two sub-clusters based on the travel direction. For each cluster, we provide a time-based distribution of trajectories to yield insights into how urban mobility patterns change with the time of day. We compare the trajectory clusters obtained using our approach with those obtained using popular general and trajectory-specific clustering frameworks: DBSCAN, OPTICS, NETSCAN, and NEAT. We demonstrate that the clusters obtained using our fast-clusiVAT framework are better than those obtained using the other clustering schemes, as evaluated by two internal cluster validity measures: Dunn's index and the Silhouette index. Moreover, our fast-clusiVAT algorithm achieves a significant speedup over a comparable approach without loss of cluster quality.
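For readers unfamiliar with dynamic time warping, the following is a minimal sketch of the classic DTW distance between two point sequences. It is not the trajDTW measure proposed in the paper, which is Dijkstra-based and defined on the road network; the function name `dtw_distance`, the use of plain (x, y) tuples, and the Euclidean local cost are assumptions made purely for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two point sequences.

    Illustrative only: uses straight-line (Euclidean) distance between
    points as the local cost, which is an assumption and not the
    road-network-based cost used by trajDTW.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both trajectories must contain at least one point")

    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    t1 = [(0, 0), (1, 0), (2, 0)]
    t2 = [(0, 0), (2, 0)]
    print(dtw_distance(t1, t2))
```

This quadratic-time formulation compares every pair of points, which hints at why a distance measure that scales to millions of trajectories is the central difficulty the abstract describes.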



Journal

IEEE Transactions on Intelligent Transportation Systems

Pagination

3709-3722

Location

Piscataway, N.J.





Publication classification

C1 Refereed article in a scholarly journal

Copyright notice

2018, IEEE