A Comprehensive Survey on Word Representation Models: From Classical to State-of-the-Art Word Representation Language Models

Naseem, U, Razzak, Muhammad Imran, Khan, SK and Prasad, M 2021, A Comprehensive Survey on Word Representation Models: From Classical to State-of-the-Art Word Representation Language Models, ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 20, no. 5, pp. 1-35, doi: 10.1145/3434237.

Title A Comprehensive Survey on Word Representation Models: From Classical to State-of-the-Art Word Representation Language Models
Author(s) Naseem, U
Razzak, Muhammad Imran (ORCID iD: orcid.org/0000-0002-3930-6600)
Khan, SK
Prasad, M
Journal name ACM Transactions on Asian and Low-Resource Language Information Processing
Volume number 20
Issue number 5
Start page 1
End page 35
Total pages 35
Publisher Association for Computing Machinery
Place of publication New York, N.Y.
Publication date 2021
ISSN 2375-4699
2375-4702
Keyword(s) Text mining
natural language processing
word representation
language models
Summary Word representation has always been an important research area in the history of natural language processing (NLP). Understanding complex text data is imperative, given that it is rich in information and can be used widely across various applications. In this survey, we explore different word representation models and their expressive power, from the classical to modern-day state-of-the-art (SOTA) word representation language models (LMs). We describe a variety of text representation methods and model designs that have blossomed in the context of NLP, including SOTA LMs. These models can transform large volumes of text into effective vector representations that capture the same semantic information. Further, such representations can be utilized by various machine learning (ML) algorithms for a variety of NLP-related tasks. Finally, this survey briefly discusses the commonly used ML- and deep learning (DL)-based classifiers, evaluation metrics, and the applications of these word embeddings in different NLP tasks.
Language eng
DOI 10.1145/3434237
Indigenous content off
HERDC Research category C1 Refereed article in a scholarly journal
Persistent URL http://hdl.handle.net/10536/DRO/DU:30153522

Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Created: Tue, 13 Jul 2021, 20:41:32 EST
