Deakin University

File(s) not publicly available

Scalable Cluster Tendency Assessment for Streaming Activity Data using Recurring Shapelets

journal contribution
posted on 2023-02-14, 04:15 authored by S Datta, Chandan Karmakar, P Rathore, M Palaniswami
Automatic interpretation of cluster structure in rapidly arriving data streams is essential for timely detection of interesting events. Human activities often contain bursts of repeating patterns. In this paper, we propose a new relative of the Visual Assessment of Cluster Tendency (VAT) model to interpret cluster evolution in streaming activity data where the shapes of recurring patterns are important. Existing VAT algorithms either suit only small batch data and do not scale to rapidly evolving streams, or cannot capture shape patterns. Our proposed incremental algorithm processes streaming data in chunks and identifies repeating patterns, or shapelets, in each chunk, creating a Dictionary-of-Shapes (DoS) that is updated on the fly. Each chunk is transformed into a lower-dimensional representation based on its distance from the shapelets in the current DoS. A small set of transformed chunks is then sampled using an intelligent Maximin Random Sampling (MMRS) scheme to create a scalable VAT image that is incrementally updated as the data stream progresses. Experiments on two upper limb activity datasets demonstrate that the proposed method can successfully and efficiently visualize clusters in long streams of data and can also identify anomalous movements.
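
The pipeline described in the abstract (shapelet-distance transform, maximin sampling, VAT reordering) can be illustrated with a minimal sketch. This is not the authors' implementation: the shapelet dictionary is hand-picked rather than discovered, the sampler is only the greedy maximin step and omits the random-sampling stage of MMRS, and the VAT ordering is the classic batch version, not an incremental one. All function names and the toy data are hypothetical.

```python
import math

def euclid(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def shapelet_distance(chunk, shapelet):
    """Smallest distance between the shapelet and any same-length window
    of the chunk (assumes len(chunk) >= len(shapelet))."""
    m = len(shapelet)
    return min(euclid(chunk[i:i + m], shapelet)
               for i in range(len(chunk) - m + 1))

def transform(chunk, dictionary):
    """Represent a chunk by its distance to every shapelet in the dictionary."""
    return [shapelet_distance(chunk, s) for s in dictionary]

def maximin_sample(points, k, start=0):
    """Greedy maximin: repeatedly pick the point farthest from those chosen."""
    chosen = [start]
    nearest = [euclid(p, points[start]) for p in points]
    while len(chosen) < min(k, len(points)):
        nxt = max(range(len(points)), key=lambda i: nearest[i])
        chosen.append(nxt)
        nearest = [min(d, euclid(p, points[nxt]))
                   for d, p in zip(nearest, points)]
    return chosen

def vat_order(D):
    """Classic VAT ordering of a square dissimilarity matrix: start at an
    endpoint of the largest dissimilarity, then repeatedly append the
    unvisited object closest to any visited one."""
    n = len(D)
    first = max(range(n), key=lambda i: max(D[i]))
    order = [first]
    remaining = [j for j in range(n) if j != first]
    while remaining:
        nxt = min(remaining, key=lambda j: min(D[i][j] for i in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Hypothetical dictionary: a "spike" and a "plateau" shapelet.
dictionary = [[0, 1, 0], [1, 1, 1]]
# Toy stream chunks, alternating between the two patterns.
chunks = [[0, 0, 1, 0, 0], [0, 1, 1, 1, 0], [0, 1, 0, 0, 0], [1, 1, 1, 0, 0]]

points = [transform(c, dictionary) for c in chunks]   # 2-D representation
sample = maximin_sample(points, k=2)                  # one chunk per pattern
D = [[euclid(p, q) for q in points] for p in points]  # dissimilarity matrix
order = vat_order(D)
reordered = [[D[i][j] for j in order] for i in order] # matrix behind a VAT image
```

In the toy data, the reordering places the two spike chunks next to each other and the two plateau chunks next to each other, so `reordered` has low-dissimilarity blocks along its diagonal. Those blocks are what a VAT image renders as dark squares.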

History

Journal

Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS

Volume

2022-July

Pagination

1036-1040

Location

Piscataway, N.J.

ISSN

1557-170X

eISSN

2694-0604

Language

eng

Publication classification

C1.1 Refereed article in a scholarly journal

Publisher

IEEE
