Statistical detection of online drifting twitter spam

Liu, Shigang, Zhang, Jun and Xiang, Yang 2016, Statistical detection of online drifting twitter spam, in ASIA CCS 2016 - Proceedings of the 11th ACM Asia Conference on Computer and Communications Security, ACM, New York, N. Y., pp. 1-10, doi: 10.1145/2897845.2897928.


Title Statistical detection of online drifting twitter spam
Author(s) Liu, Shigang
Zhang, Jun
Xiang, Yang
Conference name Computer and Communications Security. Conference (11th : 2016 : Xi'an, China)
Conference location Xi'an, China
Conference dates 30 May - 3 June 2016
Title of proceedings ASIA CCS 2016 - Proceedings of the 11th ACM Asia Conference on Computer and Communications Security
Editor(s) [Unknown]
Publication date 2016
Conference series ACM on Asia Conference on Computer and Communications Security
Start page 1
End page 10
Total pages 10
Publisher ACM
Place of publication New York, N. Y.
Keyword(s) Twitter spam detection
social network security
security data analytics
Summary Spam has become a critical problem in online social networks. This paper focuses on Twitter spam detection. Recent research applies machine learning techniques to Twitter spam detection, making use of the statistical features of tweets. We observe that existing machine-learning-based detection methods suffer from the problem of Twitter spam drift, i.e., the statistical properties of spam tweets vary over time. To avoid this problem, an effective solution is to train a new Twitter spam classifier every day. However, this approach faces the challenge of a small and imbalanced set of training data, because labelling spam samples is time-consuming. This paper proposes a new method to address this challenge. The new method employs two new techniques, fuzzy-based redistribution and asymmetric sampling. We develop a fuzzy-based information decomposition technique to redistribute the spam class and generate more spam samples. Moreover, an asymmetric sampling technique is proposed to rebalance the sizes of the spam and non-spam samples in the training data. Finally, we apply an ensemble technique to combine the spam classifiers trained over two different training sets. A number of experiments are performed on a real-world 10-day ground-truth dataset to evaluate the new method. Experimental results show that the new method can significantly improve the detection performance for drifting Twitter spam.
ISBN 9781450342339
Language eng
DOI 10.1145/2897845.2897928
Field of Research 080303 Computer System Security
080503 Networking and Communications
080109 Pattern Recognition and Data Mining
Socio Economic Objective 970108 Expanding Knowledge in the Information and Computing Sciences
HERDC Research category E1 Full written paper - refereed
ERA Research output type E Conference publication
Copyright notice ©2016, ACM

Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Citation counts: Cited 23 times in TR Web of Science; cited 14 times in Scopus
Access Statistics: 293 Abstract Views, 2 File Downloads
Created: Thu, 22 Sep 2016, 12:27:10 EST
