posted on 2019-01-01, 00:00authored byHung Vu, Dinh Nguyen Tu, Le Trung, Wei LuoWei Luo, Phung Dinh
Detecting anomalies in surveillance videos has long been an important but unsolved problem. In particular, many existing solutions are overly sensitive to (often ephemeral) visual artifacts in the raw video data, resulting in false positives and fragmented detection regions. To overcome such sensitivity and to capture true anomalies with semantic significance, one natural idea is to seek validation from abstract representations of the videos. This paper introduces a framework of robust anomaly detection using multilevel representations of both intensity and motion data. The framework consists of three main components: 1) representation learning using Denoising Autoencoders, 2) level-wise representation generation using Conditional Generative Adversarial Networks, and 3) consolidating anomalous regions detected at each representation level. Our proposed multilevel detector shows a significant improvement in pixel-level Equal Error Rate, namely 11.35%, 12.32% and 4.31% improvement in UCSD Ped 1, UCSD Ped 2 and Avenue datasets respectively. In addition, the model allowed us to detect mislabeled anomalies in the UCDS Ped 1.
Proceedings of the Combined Conferences : 33rd AAAI Conference on Artificial Intelligence / 31st Innovative Applications of Artificial Intelligence Conference / 9th AAAI Symposium on Educational Advances in Artificial Intelligence
Event
AAAI Conference on Artificial Intelligence., Innovative Applications of Artificial Intelligence Conference and AAAI Symposium on Educational Advances in Artificial Intelligence. Combined Conference (2019 : 33rd, 31st & 9th : Honolulu, Hawaii)