Deakin University

File(s) under permanent embargo

Density-based clustering using approximate natural neighbours

Version 2 2024-06-04, 12:11
Version 1 2019-10-23, 17:42
journal contribution
posted on 2024-06-04, 12:11 authored by Maia Angelova, Gleb Beliakov, Ye Zhu
We propose a computationally efficient metric, based on natural neighbours and fuzzy measures, for discovering clusters of arbitrary shape. The approximate natural neighbours are found with the help of the Choquet integral with respect to a specially designed two-additive fuzzy measure. A fuzzy betweenness relation is used to construct this measure and helps determine the natural neighbours of a query point. The natural neighbours of a datum allow the computation of a point density estimate, which in turn defines a density-based metric suitable for clustering. The proposed method avoids the exponential computational complexity of the Delaunay triangulation traditionally used to identify natural neighbours. With respect to the data size, the run-time of the density estimate under this metric follows the same quadratic trend as estimates based on the Euclidean distance. Empirical evaluation on 20 synthetic and real-world datasets shows that this metric improves the clustering accuracy of existing state-of-the-art density-based clustering algorithms, such as DBSCAN, SNN and DP. Furthermore, the proposed metric is easily combined with these algorithms, and the enhanced clustering algorithms inherit their positive features, such as resistance to noise and imbalanced data.
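The abstract relies on the discrete Choquet integral with respect to a two-additive fuzzy measure. The sketch below illustrates that general construct only. It is not the authors' specially designed measure, which this record does not describe, and it does not implement the fuzzy betweenness relation or the natural neighbour search. The function names `two_additive_measure` and `choquet`, and all the Moebius masses used in the example, are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def two_additive_measure(singletons, pairs):
    """Return mu(A) for a two-additive fuzzy measure.

    singletons[i] is the Moebius mass of criterion i; pairs[(i, j)] with
    i < j is the Moebius mass of the pair {i, j}. In a two-additive
    measure all larger subsets have zero Moebius mass, so mu(A) is the
    sum of the masses of the singletons and pairs contained in A.
    """
    def mu(subset):
        total = sum(singletons[i] for i in subset)
        total += sum(pairs.get((i, j), 0.0)
                     for i, j in combinations(sorted(subset), 2))
        return total
    return mu


def choquet(x, mu):
    """Discrete Choquet integral of non-negative inputs x with respect to mu."""
    order = sorted(range(len(x)), key=lambda i: x[i])  # indices by ascending value
    result, previous = 0.0, 0.0
    for k, i in enumerate(order):
        upper = frozenset(order[k:])  # criteria whose value is at least x[i]
        result += (x[i] - previous) * mu(upper)
        previous = x[i]
    return result


# Hypothetical masses: with no pair interaction the integral is a weighted mean.
additive = two_additive_measure([0.2, 0.3, 0.5], {})
weighted_mean = choquet([1.0, 2.0, 3.0], additive)

# Hypothetical masses: a positive pair mass pulls the result towards the minimum.
interacting = two_additive_measure([0.4, 0.4], {(0, 1): 0.2})
pulled_down = choquet([0.2, 0.8], interacting)
```

When every pair mass is zero the measure is additive and the Choquet integral reduces to an ordinary weighted arithmetic mean. Non-zero pair masses are what let a two-additive measure model interaction between inputs while needing only a quadratic number of parameters.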

History

Journal

Applied Soft Computing

Volume

85

Article number

105867

Pagination

1 - 10

Location

Amsterdam, The Netherlands

ISSN

1568-4946

eISSN

1872-9681

Language

English

Publication classification

C1 Refereed article in a scholarly journal

Copyright notice

2019, Elsevier

Publisher

Elsevier