Deakin University

Wikipedia vandal early detection: from user behavior to user embedding

conference contribution
posted on 2018-01-01, 00:00 authored by S Yuan, P Zheng, X Wu, Y Xiang
© 2017, Springer International Publishing AG. Wikipedia is the largest online encyclopedia that allows anyone to edit articles. In this paper, we propose the use of deep learning to detect vandals based on their edit history. In particular, we develop a multi-source long short-term memory network (M-LSTM) to model user behaviors, using a variety of user edit aspects as inputs, including the history of edit reversion information, edit page titles and categories. With M-LSTM, we can encode each user into a low-dimensional real vector, called a user embedding. As a sequential model, M-LSTM updates the user embedding each time the user commits a new edit. Thus, we can dynamically predict whether a user is benign or a vandal based on the up-to-date user embedding. Furthermore, these user embeddings are crucial for discovering collaborative vandals. Code and data related to this chapter are available at: https://bitbucket.org/bookcold/vandal_detection.
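The idea of updating a user embedding after every edit can be illustrated with a minimal sketch. This is not the authors' M-LSTM: it is a single, untrained LSTM cell with random weights, written in plain Python, and it combines the edit sources (reversion flag, title vector, category vector) by simple concatenation, which is an assumption made only for illustration. The names `TinyLSTMCell`, `embed_user`, `vandal_score`, and the toy feature vectors are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMCell:
    """Single untrained LSTM cell with random weights (illustration only)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, g = candidate cell state.
        self.W = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(n)]
                   for _ in range(hidden_size)]
            for gate in "ifog"
        }
        self.b = {gate: [0.0] * hidden_size for gate in "ifog"}

    def _affine(self, gate, z):
        return [sum(w * v for w, v in zip(row, z)) + bias
                for row, bias in zip(self.W[gate], self.b[gate])]

    def step(self, x, h, c):
        """Standard LSTM update: returns the new hidden and cell states."""
        z = list(x) + list(h)
        i = [sigmoid(a) for a in self._affine("i", z)]
        f = [sigmoid(a) for a in self._affine("f", z)]
        o = [sigmoid(a) for a in self._affine("o", z)]
        g = [math.tanh(a) for a in self._affine("g", z)]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def embed_user(edits, cell):
    """Return the user embedding (hidden state) after each edit."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    history = []
    for edit in edits:
        # Assumption: the sources are concatenated into one input vector.
        x = [edit["reverted"]] + edit["title_vec"] + edit["category_vec"]
        h, c = cell.step(x, h, c)
        history.append(list(h))
    return history


def vandal_score(embedding, weights, bias=0.0):
    """Logistic score on top of the embedding (weights are made up)."""
    return sigmoid(sum(w * e for w, e in zip(weights, embedding)) + bias)


# Toy edit history: 1 reversion flag + 2-dim title + 2-dim category vectors.
edits = [
    {"reverted": 0.0, "title_vec": [0.2, 0.1], "category_vec": [0.0, 1.0]},
    {"reverted": 1.0, "title_vec": [0.9, 0.4], "category_vec": [1.0, 0.0]},
    {"reverted": 1.0, "title_vec": [0.8, 0.5], "category_vec": [1.0, 0.0]},
]
cell = TinyLSTMCell(input_size=5, hidden_size=4)
embeddings = embed_user(edits, cell)
score = vandal_score(embeddings[-1], weights=[0.1, -0.2, 0.3, 0.05])
```

Because `embed_user` keeps the embedding after every step, a classifier can be applied to `embeddings[k]` as soon as the k-th edit arrives, which mirrors the early-detection setting the abstract describes. In a real system the cell and classifier weights would be learned from labeled edit histories rather than drawn at random.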

History

Volume: 10534

Pagination: 832-846

Location: Skopje, Macedonia

Start date: 2017-09-18

End date: 2017-09-22

ISSN: 0302-9743

eISSN: 1611-3349

ISBN-13: 9783319712482

Language: eng

Publication classification: E Conference publication, E1.1 Full written paper - refereed

Copyright notice: 2017, Springer International Publishing AG

Title of proceedings: ECML PKDD 2017 : Proceedings, Part I : Machine Learning and Knowledge Discovery in Databases

Event: Machine Learning and Knowledge Discovery in Databases. Joint European Conference (2017 : Skopje, Macedonia)

Publisher: Springer

Place of publication: Cham, Switzerland

Series: Lecture Notes in Computer Science
