File(s) under permanent embargo
Sqn2Vec: learning sequence representation via sequential patterns with a gap constraint
conference contribution
posted on 2019-01-01, 00:00 authored by Dang Pham Hai Nguyen, Wei Luo, Tu Dinh Nguyen, Svetha Venkatesh, Quoc-Dinh Phung
When learning sequence representations, traditional pattern-based methods often suffer from data sparsity and high dimensionality, while recent neural embedding methods often fail on sequential datasets with a small vocabulary. To address these disadvantages, we propose an unsupervised method (named Sqn2Vec) which first leverages sequential patterns (SPs) to increase the vocabulary size and then learns low-dimensional continuous vectors for sequences via a neural embedding model. Moreover, our method enforces a gap constraint among symbols in sequences to obtain meaningful and discriminative SPs. Consequently, Sqn2Vec produces significantly better sequence representations than a comprehensive list of state-of-the-art baselines, particularly on sequential datasets with a relatively small vocabulary. We demonstrate the superior performance of Sqn2Vec in several machine learning tasks, including sequence classification, clustering, and visualization.
Event
European Machine Learning and Data Mining. Conference (2018 : Dublin, Ireland)
Volume
11052
Series
European Machine Learning and Data Mining Conference
Pagination
569 - 584
Publisher
Springer
Location
Dublin, Ireland
Place of publication
Cham, Switzerland
Start date
2018-09-10
End date
2018-09-14
ISSN
0302-9743
eISSN
1611-3349
ISBN-13
9783030109271
Language
eng
Publication classification
E1 Full written paper - refereed
Copyright notice
2019, Springer Nature Switzerland AG
Editor/Contributor(s)
M Berlingerio, F Bonchi, T Gärtner, N Hurley, G Ifrim
Title of proceedings
ECML-PKDD 2018 : Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases