Sqn2Vec: learning sequence representation via sequential patterns with a gap constraint

Nguyen, Dang Pham Hai; Luo, Wei; Nguyen, Tu Dinh; Venkatesh, Svetha; Phung, Quoc-Dinh

File(s) under permanent embargo

conference contribution

posted on 2019-01-01, 00:00 authored by Dang Pham Hai Nguyen, Wei Luo, Tu Dinh Nguyen, Svetha Venkatesh, Quoc-Dinh Phung

When learning sequence representations, traditional pattern-based methods often suffer from data sparsity and high dimensionality, while recent neural embedding methods often fail on sequential datasets with a small vocabulary. To address these disadvantages, we propose an unsupervised method, Sqn2Vec, which first leverages sequential patterns (SPs) to increase the vocabulary size and then learns low-dimensional continuous vectors for sequences via a neural embedding model. Moreover, our method enforces a gap constraint among symbols in sequences to obtain meaningful and discriminative SPs. Consequently, Sqn2Vec produces significantly better sequence representations than a comprehensive list of state-of-the-art baselines, particularly on sequential datasets with a relatively small vocabulary. We demonstrate the superior performance of Sqn2Vec in several machine learning tasks, including sequence classification, clustering, and visualization.
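To make the idea of sequential patterns with a gap constraint concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the gap is the number of symbols skipped between two consecutive matched symbols, that patterns are found by brute-force enumeration (a real miner would be far more efficient), and that each sequence is then turned into a token list of its symbols plus its matching patterns, ready for whichever neural embedding model is used. The names `occurs_with_gap`, `mine_patterns`, `to_document`, `max_gap`, `min_support`, and `max_len` are hypothetical and do not come from the paper.

```python
from itertools import product


def occurs_with_gap(sequence, pattern, max_gap):
    """Return True if `pattern` is a subsequence of `sequence` in which at
    most `max_gap` symbols are skipped between consecutive matched symbols.
    (Assumed gap definition, for illustration only.)"""
    if not pattern:
        return True
    # Positions where the first pattern symbol can be matched.
    ends = {i for i, s in enumerate(sequence) if s == pattern[0]}
    for symbol in pattern[1:]:
        # Extend every partial match by one symbol within the allowed gap.
        ends = {
            j
            for i in ends
            for j in range(i + 1, min(i + max_gap + 2, len(sequence)))
            if sequence[j] == symbol
        }
        if not ends:
            return False
    return bool(ends)


def mine_patterns(sequences, min_support, max_gap, max_len=3):
    """Brute-force enumeration of patterns whose relative support
    (fraction of sequences containing them) is at least `min_support`."""
    alphabet = sorted({s for seq in sequences for s in seq})
    frequent = []
    for length in range(1, max_len + 1):
        for candidate in product(alphabet, repeat=length):
            hits = sum(occurs_with_gap(seq, candidate, max_gap) for seq in sequences)
            if hits / len(sequences) >= min_support:
                frequent.append(candidate)
    return frequent


def to_document(sequence, patterns, max_gap):
    """Tokens for one sequence: its symbols plus every multi-symbol
    pattern it contains, which enlarges the effective vocabulary."""
    tokens = [str(s) for s in sequence]
    for p in patterns:
        if len(p) > 1 and occurs_with_gap(sequence, p, max_gap):
            tokens.append("-".join(str(s) for s in p))
    return tokens


if __name__ == "__main__":
    data = ["ab", "ab", "ba"]
    patterns = mine_patterns(data, min_support=0.6, max_gap=0, max_len=2)
    for seq in data:
        print(seq, to_document(seq, patterns, max_gap=0))
```

With a two-symbol alphabet the raw vocabulary has only two tokens; adding the pattern tokens gives the embedding model more distinct units to learn from, which is the motivation stated in the abstract. A tighter `max_gap` keeps only patterns whose symbols occur close together.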

Event

European Machine Learning and Data Mining. Conference (2018 : Dublin, Ireland)

Volume

11052

Series

European Machine Learning and Data Mining Conference

Pagination

569 - 584

Publisher

Springer

Location

Dublin, Ireland

Place of publication

Cham, Switzerland

Publisher DOI

https://doi.org/10.1007/978-3-030-10928-8_34

Start date

2018-09-10

End date

2018-09-14

ISSN

0302-9743

eISSN

1611-3349

ISBN-13

9783030109271

Language

eng

Publication classification

E1 Full written paper - refereed

Copyright notice

2019, Springer Nature Switzerland AG

Editor/Contributor(s)

M Berlingerio, F Bonchi, T Gärtner, N Hurley, G Ifrim

Title of proceedings

ECML-PKDD 2018 : Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases

Keywords

Sqn2Vec; sequence representations; data sparsity; high-dimensionality problems; sequential patterns (SPs); Science & Technology; Technology; Computer Science, Artificial Intelligence; Computer Science
