Deakin University
File(s) under permanent embargo

Who are the spoilers in social media marketing? Incremental learning of latent semantics for social spam detection

journal contribution
posted on 2017-03-01, 00:00 authored by L Song, R Y K Lau, R C W Kwok, Kristijan Mirkovski, W Dou
With the rise of the social web, there has been growing concern about the quality of user-generated content on social media sites (SMSs). Deceptive comments harm users’ trust in online social media and cause financial loss to firms. Previous studies have used various features and classification algorithms to detect and filter social spam on several social media platforms. However, to the best of our knowledge, previous studies have not exploited both probabilistic topic modeling and incremental learning to detect social spam on SMSs. Thus, the main contribution of this paper is the design of a novel detection methodology that combines topic- and user-based features to improve the effectiveness of social spam detection. The proposed methodology exploits a probabilistic generative model, namely labeled latent Dirichlet allocation (L-LDA), to mine the latent semantics of user-generated comments, and an incremental learning approach to handle the changing feature space. An experiment based on a large dataset extracted from YouTube demonstrates the effectiveness of the proposed methodology, which achieves an average accuracy of 91.17% in social spam detection. Our statistical analysis reveals that topic-based features significantly improve social spam detection, a finding with important implications for business practice.
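
The general idea of combining topic- and user-based features and learning incrementally over a changing feature space can be illustrated with a minimal sketch. This is not the authors' implementation, and several things in it are assumptions. The topic proportions are hypothetical precomputed values standing in for L-LDA output. The feature names (`promo`, `giveaway`, `links_per_comment`) are invented for illustration. The learner is a plain online logistic regression, used only as a stand-in for whatever incremental algorithm the paper uses. Weights are kept in a dictionary keyed by feature name, so a feature that first appears later in the stream simply receives a new weight.

```python
import math


def combine_features(topic_proportions, user_features):
    """Merge topic-based and user-based features into one sparse vector.

    Both arguments are dicts of name -> numeric value. The prefixes keep
    the two feature families from colliding on a shared name.
    """
    combined = {f"topic:{name}": value for name, value in topic_proportions.items()}
    combined.update({f"user:{name}": value for name, value in user_features.items()})
    return combined


class IncrementalSpamClassifier:
    """Online logistic regression over sparse, named features (illustrative)."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.weights = {}  # feature name -> weight; grows as new features appear
        self.bias = 0.0

    def predict_proba(self, features):
        """Return the estimated probability that a comment is spam."""
        score = self.bias + sum(
            self.weights.get(name, 0.0) * value for name, value in features.items()
        )
        score = max(-30.0, min(30.0, score))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-score))

    def partial_fit(self, features, label):
        """Update the model with one labeled comment (label 1 = spam, 0 = not spam)."""
        error = label - self.predict_proba(features)
        self.bias += self.learning_rate * error
        for name, value in features.items():
            self.weights[name] = self.weights.get(name, 0.0) + self.learning_rate * error * value


# Hypothetical stream of (topic proportions, user features, label).
# The "giveaway" topic only shows up in the later examples, which mimics
# a feature space that changes over time.
stream = [
    ({"promo": 0.9, "music": 0.1}, {"links_per_comment": 1.0}, 1),
    ({"promo": 0.1, "music": 0.9}, {"links_per_comment": 0.0}, 0),
    ({"giveaway": 0.8, "music": 0.2}, {"links_per_comment": 1.0}, 1),
    ({"music": 1.0}, {"links_per_comment": 0.0}, 0),
]

classifier = IncrementalSpamClassifier()
for _ in range(50):  # several passes over the tiny toy stream
    for topics, user, label in stream:
        classifier.partial_fit(combine_features(topics, user), label)
```

After training, `classifier.predict_proba(combine_features(topics, user))` returns a spam probability for a new comment. Any feature the model has never seen contributes nothing to the score until a labeled example containing it arrives.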

History

Journal

Electronic commerce research

Volume

17

Issue

1

Pagination

51 - 81

Publisher

Springer

Location

New York, N.Y.

ISSN

1389-5753

eISSN

1572-9362

Language

eng

Publication classification

C1.1 Refereed article in a scholarly journal; C Journal article

Copyright notice

2016, Springer Science+Business Media New York