Deakin University

File(s) under permanent embargo

Sqn2Vec: learning sequence representation via sequential patterns with a gap constraint

conference contribution
posted on 2019-01-01, 00:00 authored by Dang Pham Hai Nguyen, Wei Luo, Tu Dinh Nguyen, Svetha Venkatesh, Quoc-Dinh Phung
When learning sequence representations, traditional pattern-based methods often suffer from data sparsity and high dimensionality, while recent neural embedding methods often fail on sequential datasets with a small vocabulary. To address these shortcomings, we propose an unsupervised method (named Sqn2Vec) which first leverages sequential patterns (SPs) to increase the vocabulary size and then learns low-dimensional continuous vectors for sequences via a neural embedding model. Moreover, our method enforces a gap constraint among symbols in sequences to obtain meaningful and discriminative SPs. Consequently, Sqn2Vec produces significantly better sequence representations than a comprehensive list of state-of-the-art baselines, particularly on sequential datasets with a relatively small vocabulary. We demonstrate the superior performance of Sqn2Vec in several machine learning tasks, including sequence classification, clustering, and visualization.
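
To make the gap constraint mentioned in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a "gap" is the number of symbols skipped between two consecutive matched symbols of a pattern, and that a pattern's support is the fraction of sequences containing it under that limit. The names `occurs_with_gap`, `support`, and `max_gap` are hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def occurs_with_gap(pattern: Sequence, sequence: Sequence, max_gap: int) -> bool:
    """Check whether `pattern` is a subsequence of `sequence` such that at most
    `max_gap` symbols are skipped between consecutive matched symbols.

    Assumption: this gap definition is illustrative, not taken from the paper.
    """

    def match(p_idx: int, last_pos: Optional[int]) -> bool:
        # Every pattern symbol has been placed: the pattern occurs.
        if p_idx == len(pattern):
            return True
        if last_pos is None:
            # The first symbol may start anywhere in the sequence.
            candidates = range(len(sequence))
        else:
            # Later symbols must fall within max_gap skipped positions.
            stop = min(len(sequence), last_pos + max_gap + 2)
            candidates = range(last_pos + 1, stop)
        return any(
            sequence[i] == pattern[p_idx] and match(p_idx + 1, i)
            for i in candidates
        )

    return match(0, None)


def support(pattern: Sequence, sequences: Sequence[Sequence], max_gap: int) -> float:
    """Fraction of sequences that contain `pattern` under the gap limit."""
    hits = sum(occurs_with_gap(pattern, s, max_gap) for s in sequences)
    return hits / len(sequences)
```

Under these assumptions, the pattern `"ac"` occurs in `"abc"` with `max_gap=1` but not in `"abbc"`, where two symbols separate `a` and `c`. Patterns whose support passes a chosen threshold could then serve as additional vocabulary items for an embedding model, which is the general idea the abstract describes.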

Event

European Machine Learning and Data Mining. Conference (2018 : Dublin, Ireland)

Volume

11052

Series

European Machine Learning and Data Mining Conference

Pagination

569 - 584

Publisher

Springer

Location

Dublin, Ireland

Place of publication

Cham, Switzerland

Start date

2018-09-10

End date

2018-09-14

ISSN

0302-9743

eISSN

1611-3349

ISBN-13

9783030109271

Language

eng

Publication classification

E1 Full written paper - refereed

Copyright notice

2019, Springer Nature Switzerland AG

Editor/Contributor(s)

M Berlingerio, F Bonchi, T Gärtner, N Hurley, G Ifrim

Title of proceedings

ECML-PKDD 2018 : Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
