A Comprehensive Survey on Word Representation Models: From Classical to State-of-the-Art Word Representation Language Models

Naseem, Usman; Razzak, Imran; Khan, Shah Khalid; Prasad, Mukesh

File(s) under permanent embargo

journal contribution

posted on 2021-01-01, 00:00 authored by Usman Naseem, Imran Razzak, Shah Khalid Khan, Mukesh Prasad

Word representation has always been an important research area in the history of natural language processing (NLP). Understanding complex text data is imperative, given that it is rich in information and can be used widely across various applications. In this survey, we explore different word representation models and their expressive power, from the classical to modern-day state-of-the-art (SOTA) word representation language models (LMs). We describe a variety of text representation methods and model designs that have blossomed in the context of NLP, including SOTA LMs. These models can transform large volumes of text into effective vector representations that capture the same semantic information. Such representations can then be utilized by various machine learning (ML) algorithms for a variety of NLP-related tasks. Finally, this survey briefly discusses the commonly used ML- and deep learning (DL)-based classifiers, evaluation metrics, and the applications of these word embeddings in different NLP tasks.

Journal

ACM Transactions on Asian and Low-Resource Language Information Processing

Volume

20

Issue

5

Pagination

1 - 35

Publisher

Association for Computing Machinery

Location

New York, N.Y.

Publisher DOI

https://doi.org/10.1145/3434237

ISSN

2375-4699

eISSN

2375-4702

Language

eng

Publication classification

C1 Refereed article in a scholarly journal

Keywords

Text mining, natural language processing, word representation, language models, Artificial Intelligence and Image Processing

