Asymmetric self-learning for tackling Twitter spam drift

Chen, C; Zhang, J; Xiang, Y; Zhou, W

conference contribution

posted on 2024-06-06, 00:27 authored by C Chen, J Zhang, Y Xiang, W Zhou

Spam has become a critical problem on Twitter. To stop spammers, security companies apply blacklisting services to filter spam links. However, over 90% of victims will visit a new malicious link before it is blocked by blacklists. To overcome this limitation of blacklists, researchers have proposed a number of mechanisms based on statistical features, and have applied machine learning (ML) techniques to detect Twitter spam. In our large labelled dataset, we observe that the statistical properties of spam tweets vary over time, and thus the performance of existing ML-based classifiers is poor. This phenomenon is referred to as 'Twitter Spam Drift'. To tackle this problem, we carry out an in-depth analysis of 1 million spam tweets and 1 million non-spam tweets, and propose an asymmetric self-learning (ASL) approach. The proposed ASL can discover new information about changed Twitter spam and incorporate it into the classifier training process. A number of experiments are performed to evaluate the ASL approach. The results show that the ASL approach can significantly improve the spam detection accuracy of traditional ML algorithms.
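
The abstract does not spell out the ASL procedure, so the following is only a minimal sketch under one plausible reading: a classifier labels unlabelled tweets, and only those it predicts as spam are added back to the training set before retraining (hence "asymmetric"). The nearest-centroid classifier, the two-dimensional feature vectors, and every function name here are hypothetical stand-ins, not the authors' implementation.

```python
from statistics import mean


def centroid(rows):
    # Column-wise mean of a list of equal-length feature vectors.
    return [mean(col) for col in zip(*rows)]


def dist2(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(spam, ham):
    # Toy "classifier": one centroid per class (an assumption, not the paper's model).
    return centroid(spam), centroid(ham)


def predict(model, x):
    spam_c, ham_c = model
    return "spam" if dist2(x, spam_c) < dist2(x, ham_c) else "ham"


def asymmetric_self_learn(spam, ham, unlabeled, rounds=3):
    # Only tweets predicted as spam are fed back into training;
    # the non-spam training set is left unchanged.
    spam = list(spam)
    pool = list(unlabeled)
    model = train(spam, ham)
    for _ in range(rounds):
        new_spam = [x for x in pool if predict(model, x) == "spam"]
        if not new_spam:
            break
        spam.extend(new_spam)
        # Keep only the tweets the current model did not flag.
        pool = [x for x in pool if predict(model, x) != "spam"]
        model = train(spam, ham)
    return model
```

In this toy setting, each retraining round moves the spam centroid toward the newly flagged tweets, so drifted spam that the initial model missed can be caught in a later round.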

History

Volume

2015-August

Pagination

208-213

Location

Hong Kong

Publisher DOI

https://doi.org/10.1109/INFCOMW.2015.7179386

Start date

2015-04-26

End date

2015-05-01

ISSN

0743-166X

ISBN-13

9781467371315

Language

eng

Publication classification

E Conference publication, E1 Full written paper - refereed

Copyright notice

2015, IEEE

Title of proceedings

INFOCOM WKSHPS 2015: Proceedings of the Computer Communications Workshops

Event

IEEE Conference on Computer Communications Workshops (2015 : Hong Kong)

Publisher

IEEE

Place of publication

Piscataway, N.J.

Keywords

080303 Computer System Security
970108 Expanding Knowledge in the Information and Computing Sciences
School of Information Technology
