Deakin University

File(s) under permanent embargo

Approximating Dunn's cluster validity indices for partitions of big data

journal contribution
posted on 2019-05-01, 00:00 authored by Punit Rathore, Zahra Ghafoori, James C Bezdek, Marimuthu Palaniswami, Christopher Leckie
Dunn's internal cluster validity index is used to assess partition quality and subsequently identify a "best" crisp partition of n objects. Computing Dunn's index (DI) for partitions of n p-dimensional feature vector data has quadratic time complexity O(pn^2), so its computation is impractical for very large values of n. This note presents six methods for approximating DI. Four methods are based on Maximin sampling, which identifies a skeleton of the full partition that contains some boundary points in each cluster. Two additional methods are presented that estimate boundary points associated with unsupervised training of one-class support vector machines. Numerical examples compare approximations to DI based on all six methods. Four experiments on seven real and synthetic data sets support our assertion that computing approximations to DI with an incremental, neighborhood-based Maximin skeleton is both tractable and reliably accurate.
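As a rough illustration of the ideas in the abstract, the following Python sketch computes the classical Dunn's index (minimum between-cluster distance divided by maximum cluster diameter) and a simple approximation evaluated only on a Maximin (farthest-first) sample. It is not claimed to reproduce any of the six methods in the article: the function names, the Euclidean metric, the fixed start point, and the choice to reuse the full partition's labels on the sample are all assumptions made for illustration.

```python
import numpy as np

def pairwise_dist(A, B):
    """Euclidean distances between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def dunn_index(X, labels):
    """Exact Dunn's index; cost grows as O(p n^2), so only for small n."""
    clusters = [X[labels == k] for k in np.unique(labels)]
    # Largest within-cluster distance (cluster diameter).
    max_diam = max(pairwise_dist(C, C).max() for C in clusters)
    # Smallest distance between points in two different clusters.
    min_sep = min(
        pairwise_dist(clusters[i], clusters[j]).min()
        for i in range(len(clusters))
        for j in range(i + 1, len(clusters))
    )
    return min_sep / max_diam

def maximin_sample(X, m, start=0):
    """Indices of m points picked by Maximin (farthest-first) sampling."""
    chosen = [start]
    # Distance from each point to its nearest already-chosen sample.
    nearest = np.linalg.norm(X - X[start], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(nearest))  # point farthest from the current sample
        chosen.append(nxt)
        nearest = np.minimum(nearest, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(chosen)

def approx_dunn_index(X, labels, m, start=0):
    """Hypothetical approximation: Dunn's index on a Maximin skeleton only."""
    idx = maximin_sample(X, m, start)
    return dunn_index(X[idx], labels[idx])
```

Because Maximin sampling tends to pick points far from those already chosen, the skeleton is likely to include boundary points of each cluster, which are the points that determine both the diameters and the between-cluster distances. The sketch assumes the sample contains at least two clusters and at least one cluster with two or more sampled points; otherwise the ratio is undefined.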

History

Journal

IEEE transactions on cybernetics

Volume

49

Pagination

1629-1641

Location

Piscataway, N.J.

eISSN

2168-2275

Language

eng

Publication classification

C1.1 Refereed article in a scholarly journal

Copyright notice

2018, IEEE

Issue

5

Publisher

Institute of Electrical and Electronics Engineers