Deakin University
Browse

Fast modified global k-means algorithm for incremental cluster construction

Version 2 2024-06-04, 13:50
Version 1 2018-12-05, 15:19
journal contribution
posted on 2024-06-04, 13:50 authored by AM Bagirov, Julien UgonJulien Ugon, D Webb
The k-means algorithm and its variations are known to be fast clustering algorithms. However, they are sensitive to the choice of starting points and are inefficient for solving clustering problems in large datasets. Recently, incremental approaches have been developed to resolve difficulties with the choice of starting points. The global k-means and the modified global k-means algorithms are based on such an approach. They iteratively add one cluster center at a time. Numerical experiments show that these algorithms considerably improve the k-means algorithm. However, they require storing the whole affinity matrix or computing this matrix at each iteration. This makes both algorithms time consuming and memory demanding for clustering even moderately large datasets. In this paper, a new version of the modified global k-means algorithm is proposed. We introduce an auxiliary cluster function to generate a set of starting points lying in different parts of the dataset. We exploit information gathered in previous iterations of the incremental algorithm to eliminate the need of computing or storing the whole affinity matrix and thereby to reduce computational effort and memory usage. Results of numerical experiments on six standard datasets demonstrate that the new algorithm is more efficient than the global and the modified global k-means algorithms.

History

Journal

Pattern recognition

Volume

44

Pagination

866-876

Location

Amsterdam, The Netherlands

ISSN

0031-3203

Language

eng

Publication classification

C1.1 Refereed article in a scholarly journal

Copyright notice

2010, Elsevier Ltd.

Issue

4

Publisher

Elsevier