Who are the spoilers in social media marketing? Incremental learning of latent semantics for social spam detection

Song, L; Lau, R Y K; Kwok, R C W; Mirkovski, Kristijan; Dou, W

File(s) under permanent embargo

journal contribution

Posted on 2017-03-01, 00:00; authored by L Song, R Y K Lau, R C W Kwok, Kristijan Mirkovski, W Dou

With the rise of the social web, there has been great concern about the quality of user-generated content on social media sites (SMSs). Deceptive comments harm users’ trust in online social media and cause financial loss to firms. Previous studies use various features and classification algorithms to detect and filter social spam on several social media platforms. However, to the best of our knowledge, previous studies have not exploited both probabilistic topic modeling and incremental learning to detect social spam on SMSs. Thus, the main contribution of this paper is the design of a novel detection methodology that combines topic- and user-based features to improve the effectiveness of social spam detection. The proposed methodology exploits a probabilistic generative model, namely labeled latent Dirichlet allocation (L-LDA), for mining the latent semantics from user-generated comments, and an incremental learning approach for tackling the changing feature space. An experiment based on a large dataset extracted from YouTube demonstrates the effectiveness of our proposed methodology, which achieves an average accuracy of 91.17% in social spam detection. Our statistical analysis reveals that topic-based features significantly improve social spam detection, which has significant implications for business practice.

History

Journal

Electronic commerce research

Volume

17

Issue

1

Pagination

51 - 81

Publisher

Springer

Location

New York, N.Y.

Publisher DOI

https://doi.org/10.1007/s10660-016-9244-5

ISSN

1389-5753

eISSN

1572-9362

Language

eng

Publication classification

C1.1 Refereed article in a scholarly journal; C Journal article

Copyright notice

2016, Springer Science+Business Media New York

Keywords

social spam; spam detection; topic modeling; incremental learning; machine learning; big data; Social Sciences; Business; Management; Business & Economics; CLASSIFICATION; SYSTEM; MODEL; Information Systems
