File(s) under permanent embargo
Estimating generalized Dunn's cluster validity indices for big data
Version 2 2024-06-06, 10:55
Version 1 2019-05-01, 10:49
conference contribution
posted on 2024-06-06, 10:55 authored by P Rathore, Z Ghafoori, JC Bezdek, M Palaniswami, C Leckie
© 2018 IEEE. Dunn's internal cluster validity index and its generalizations assess partition quality. For partitions of n samples of p-dimensional feature vector data, all but two of the generalized Dunn's indices (GDIs) have quadratic time complexity O(pn^2), so computation is untenable for very large values of n. In this paper, we present two methods for approximating GDIs based on Maximin (MM) Sampling. MM sampling identifies a skeleton of the full partition that usually contains some of the boundary points in each cluster, which are used to compute GDIs. We compare our algorithms with a support vector machine based boundary extraction method and a random sampling based estimation method. Our experiments on four real and synthetic datasets show that computing approximations to (three) GDIs with the MM skeleton is both computationally tractable and reliably accurate.
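To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a greedy maximin (farthest-first) sampler applied separately to each cluster, and it uses the classical Dunn's index (minimum between-cluster distance divided by maximum cluster diameter) rather than any specific generalized variant. The function names `maximin_sample`, `dunn_index`, and `approx_dunn`, the per-cluster sample size `k`, and the fixed start point are all illustrative assumptions.

```python
import math


def maximin_sample(points, k, start=0):
    """Greedy maximin sampling: return indices of k mutually distant points."""
    chosen = [start]
    # min_dist[i] is the distance from point i to its nearest chosen point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        # Pick the point farthest from everything chosen so far.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def dunn_index(points, labels):
    """Classical Dunn's index; needs at least two clusters. Costs O(p n^2)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    groups = list(clusters.values())
    # Largest within-cluster pairwise distance (cluster diameter).
    max_diam = max(math.dist(a, b) for g in groups for a in g for b in g)
    # Smallest distance between points in different clusters.
    min_sep = min(
        math.dist(a, b)
        for i, g in enumerate(groups)
        for h in groups[i + 1:]
        for a in g
        for b in h
    )
    if max_diam == 0:
        raise ValueError("every cluster is a single point; index undefined")
    return min_sep / max_diam


def approx_dunn(points, labels, k):
    """Assumed scheme: Dunn's index on a maximin sample of each cluster."""
    sample_pts, sample_labs = [], []
    for lab in sorted(set(labels)):
        members = [p for p, l in zip(points, labels) if l == lab]
        for i in maximin_sample(members, min(k, len(members))):
            sample_pts.append(members[i])
            sample_labs.append(lab)
    return dunn_index(sample_pts, sample_labs)


# Toy data: two well-separated clusters on a line.
pts = [(float(x), 0.0) for x in range(5)] + [(float(x), 0.0) for x in range(10, 15)]
labs = [0] * 5 + [1] * 5
print(dunn_index(pts, labs), approx_dunn(pts, labs, 2))
```

On this toy data the two sampled points per cluster happen to be the cluster extremes, so the sampled index equals the full one. In general the sample only tends to include boundary points, and the result is an estimate whose cost depends on the sample size rather than on n squared.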
History
Pagination: 656-661
Location: Miyazaki, Japan
Start date: 2018-10-07
End date: 2018-10-10
ISBN-13: 9781538666500
Language: eng
Publication classification: E1.1 Full written paper - refereed
Copyright notice: 2018, IEEE
Title of proceedings: SMC 2018 : Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics
Event: Systems, Man, and Cybernetics. International Conference (2018 : Miyazaki, Japan)
Publisher: IEEE
Place of publication: Piscataway, N.J.