File(s) under permanent embargo
A New Effective and Efficient Measure for Outlying Aspect Mining
conference contribution
posted on 2020-01-01, 00:00 authored by Durgesh Samariya, Sunil Aryal, Kai Ming Ting, Jiangang Ma
Outlying Aspect Mining (OAM) aims to find the subspaces (a.k.a. aspects) in which a given query is an outlier with respect to a given data set. Existing OAM algorithms use traditional distance/density-based outlier scores to rank subspaces. Because these distance/density-based scores depend on the dimensionality of subspaces, they cannot be compared directly between subspaces of different dimensionality. Z-score normalisation has been used to make them comparable. It requires computing the outlier scores of all instances in each subspace. This adds significant computational overhead on top of already expensive density estimation, making OAM algorithms infeasible to run on large and/or high-dimensional datasets. We also discover that Z-score normalisation is inappropriate for OAM in some cases. In this paper, we introduce a new score called Simple Isolation score using Nearest Neighbor Ensemble (SiNNE), which is independent of the dimensionality of subspaces. This enables the scores in subspaces with different dimensionalities to be compared directly without any additional normalisation. Our experimental results show that SiNNE produces results that are better than, or at least the same as, those of existing scores, and that it significantly improves the runtime of an existing OAM algorithm based on beam search.
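The Z-score normalisation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the distance to the k-th nearest neighbour as the outlier score; that score, the function names and the value of k are illustrative choices, not taken from the paper. The sketch shows where the overhead comes from: to normalise the query's score in one subspace, the scores of all instances in that subspace must be computed first.

```python
import numpy as np

def knn_distance_scores(X, k=5):
    """Outlier score per row: distance to the k-th nearest neighbour
    (larger means more outlying). Assumes X has more than k rows."""
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the point's zero distance to itself,
    # so column k holds the distance to its k-th nearest neighbour.
    return np.sort(dists, axis=1)[:, k]

def z_normalised_query_score(X, query_index, subspace, k=5):
    """Z-score of the query's outlier score among all instances in one subspace."""
    # Scores of ALL instances are needed to obtain the mean and standard deviation.
    scores = knn_distance_scores(X[:, list(subspace)], k=k)
    std = scores.std()
    if std == 0:
        return 0.0
    return float((scores[query_index] - scores.mean()) / std)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    X[0] = [8.0, 0.0, 0.0]  # hypothetical query, outlying only in attribute 0
    for subspace in [(0,), (2,), (0, 1)]:
        print(subspace, z_normalised_query_score(X, 0, subspace))
```

Every candidate subspace repeats the full pairwise computation, which is the cost that a dimensionality-independent score avoids.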
Event
Web Information Systems Engineering. Conference (2020 : Amsterdam, The Netherlands)
Volume
12343
Series
Lecture Notes in Computer Science
Pagination
463 - 474
Publisher
Springer
Location
Amsterdam, The Netherlands
Place of publication
Berlin, Germany
Start date
2020-10-20
End date
2020-10-24
ISSN
0302-9743
eISSN
1611-3349
ISBN-13
9783030620073
Language
eng
Publication classification
E1 Full written paper - refereed
Title of proceedings
WISE 2020 : Proceedings of the 2020 International Conference on Web Information Systems Engineering
Keywords
Science & Technology
Technology
Computer Science, Artificial Intelligence
Computer Science, Information Systems
Computer Science, Software Engineering
Computer Science, Theory & Methods
Computer Science
Outlying aspect mining
Dimensionality-unbiased score
Outlier explanation
Nearest neighbor ensemble
CORE2020 A
Artificial Intelligence and Image Processing