File(s) under permanent embargo
Approximate cluster heat maps of large high-dimensional data
conference contribution
posted on 2018-01-01, 00:00 authored by Punit Rathore, J C Bezdek, D Kumar, Sutharshan Rajasegarar, M Palaniswami
The problem of determining whether clusters are present in numerical data (tendency assessment) is an important first step of cluster analysis. One tool for cluster tendency assessment is the visual assessment of tendency (VAT) algorithm. VAT and improved VAT (iVAT) produce an image that provides visual evidence about the number of clusters to seek in the original dataset. These methods have been successful in determining potential cluster structure in various datasets, but they can be computationally expensive for datasets with a very large number of samples. A scalable version of iVAT called siVAT approximates iVAT images, but siVAT can be computationally expensive for big datasets. In this article, we introduce a modification of siVAT called siVAT+ which approximates cluster heat maps for large volumes of high dimensional data much more rapidly than siVAT. We compare siVAT+ with siVAT on six large, high dimensional datasets. Experimental results confirm that siVAT+ obtains images similar to siVAT images in a few seconds, and is 8 - 55 times faster than siVAT.
Event
International Association for Pattern Recognition. Conference (24th : 2018 : Beijing, China)
Pagination
195 - 200
Publisher
Institute of Electrical and Electronics Engineers
Location
Beijing, China
Place of publication
Piscataway, N.J.
Start date
2018-08-20
End date
2018-08-24
ISSN
1051-4651
ISBN-13
9781538637883
Language
eng
Publication classification
E Conference publication; E1 Full written paper - refereed
Copyright notice
2018, IEEE
Editor/Contributor(s)
[Unknown]
Title of proceedings
ICPR 2018 : Proceedings of the 24th International Conference on Pattern Recognition
Keywords
cluster tendency assessment; big data cluster analysis; high-dimensional data; approximate cluster heat maps; Science & Technology; Technology; Computer Science, Artificial Intelligence; Computer Science; VISUAL ASSESSMENT; TENDENCY; NUMBER; Information Systems; Artificial Intelligence and Image Processing; Distributed Computing