CCTV scene perspective distortion estimation from low-level motion features

Arandjelovic, Ognjen, Pham, Duc-Son and Venkatesh, Svetha 2016, CCTV scene perspective distortion estimation from low-level motion features, IEEE transactions on circuits and systems for video technology, vol. 26, no. 5, pp. 939-949, doi: 10.1109/TCSVT.2015.2424055.

Title CCTV scene perspective distortion estimation from low-level motion features
Author(s) Arandjelovic, Ognjen
Pham, Duc-Son
Venkatesh, Svetha
Journal name IEEE transactions on circuits and systems for video technology
Volume number 26
Issue number 5
Start page 939
End page 949
Total pages 11
Publisher IEEE
Place of publication Piscataway, N.J.
Publication date 2016-05
ISSN 1051-8215
Keyword(s) camera
Summary Our aim is to estimate the perspective-effected geometric distortion of a scene from a video feed. In contrast to most related previous work, in this task we are constrained to use only low-level, spatiotemporally local motion features. This particular challenge arises in many semiautomatic surveillance systems that alert a human operator to potential abnormalities in the scene. Low-level spatiotemporally local motion features are sparse (and thus require comparatively little storage space) yet sufficiently powerful in the context of video abnormality detection to reduce the need for human intervention by more than 100-fold. This paper makes three significant contributions. First, we describe a dense algorithm for perspective estimation, which uses motion features to estimate the perspective distortion at each image locus and then polls all such local estimates to arrive at the globally best estimate. Second, we present an alternative coarse algorithm that subdivides the image frame into blocks, uses motion features to derive block-specific motion characteristics, and constrains the relationships between these characteristics, with the perspective estimate emerging as the result of a global optimization scheme. Third, we report the results of an evaluation on nine large sets acquired using existing closed-circuit television cameras, not installed specifically for the purposes of this paper. Our findings demonstrate that both proposed methods are successful, with accuracy matching that of human labeling using complete visual data (which, by the constraints of the setup, is unavailable to our algorithms).
Language eng
DOI 10.1109/TCSVT.2015.2424055
Field of Research 080109 Pattern Recognition and Data Mining
Socio Economic Objective 970108 Expanding Knowledge in the Information and Computing Sciences
HERDC Research category C1 Refereed article in a scholarly journal
ERA Research output type C Journal article
Copyright notice ©2015, IEEE

Citation counts: cited 6 times in Web of Science; cited 5 times in Scopus
Access statistics: 413 abstract views, 4 file downloads
Created: Thu, 24 Nov 2016, 14:45:19 EST
