Identifying useful structures in home video is difficult because this class of video differs from other video sources in its unrestricted, unedited content and the absence of a regulated storyline. In addition, home videos contain a lot of motion and erratic camera movement, with shots of the same character captured from various angles and viewpoints. In this paper, we present a solution to the challenging problem of clustering shots and faces in home videos, based on the use of SIFT features. SIFT features are known to be robust for object recognition; however, to deal with the complexities of the home video setting, the matching process needs to be augmented and adapted. This paper describes various techniques that increase the number of matches returned and improve their correctness. For example, existing methods for verifying matches are inadequate when only a small number of matches is returned, a common situation in home videos. We address this by constructing a robust classifier that works on matching sets instead of individual matches, allowing the exploitation of the geometric constraints between matches. Finally, we propose techniques for robustly extracting target clusters from individual feature matches.
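The abstract does not give the details of the matching-set classifier, so the following is only a minimal illustrative sketch of the general idea of scoring a whole set of matches through the geometric relations between them. It is not the authors' method. The function names `ratio_test_matches` and `scale_consistency`, the 0.8 ratio threshold, and the toy descriptors and keypoint coordinates are all assumptions introduced for illustration. The first function applies a common nearest-neighbour ratio test to descriptors; the second computes one possible set-level feature, the spread of pairwise distance ratios, which stays near zero when the matches agree on a single scale change.

```python
import math
from itertools import combinations


def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_a[i]'s nearest neighbour in desc_b
    is clearly closer than its second-nearest neighbour (ratio test).
    The 0.8 threshold is an assumed default, not taken from the paper."""
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches


def scale_consistency(points_a, points_b, matches):
    """Hypothetical set-level feature: standard deviation of the log ratio
    of inter-keypoint distances over all pairs of matches. Returns None
    when fewer than two usable pairs exist."""
    log_ratios = []
    for (i1, j1), (i2, j2) in combinations(matches, 2):
        d_a = math.dist(points_a[i1], points_a[i2])
        d_b = math.dist(points_b[j1], points_b[j2])
        if d_a > 0 and d_b > 0:
            log_ratios.append(math.log(d_b / d_a))
    if len(log_ratios) < 2:
        return None
    mean = sum(log_ratios) / len(log_ratios)
    return math.sqrt(sum((r - mean) ** 2 for r in log_ratios) / len(log_ratios))


# Toy 2-D "descriptors" (real SIFT descriptors are 128-dimensional).
desc_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
desc_b = [(0.0, 0.05), (1.0, 0.05), (0.05, 1.0)]
matches = ratio_test_matches(desc_a, desc_b)

# Keypoint locations: the second image is the first scaled by 2 and shifted.
points_a = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points_b_good = [(5.0, 5.0), (25.0, 5.0), (5.0, 25.0)]
# Same matches, but one keypoint lands in a geometrically implausible place.
points_b_bad = [(5.0, 5.0), (25.0, 5.0), (6.0, 5.0)]

spread_good = scale_consistency(points_a, points_b_good, matches)
spread_bad = scale_consistency(points_a, points_b_bad, matches)
```

A feature of this kind needs only three matches to be defined, which is why set-level geometric cues remain usable when few matches are returned. In practice such a value would be one input among several to a trained classifier rather than a decision rule on its own.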
History
Pagination
636 - 648
Location
Singapore, Singapore
Start date
2007-01-09
End date
2007-01-12
ISBN-13
9783540694212
ISBN-10
3540694218
Language
eng
Publication classification
E1.1 Full written paper - refereed
Copyright notice
2007, Springer-Verlag Berlin, Heidelberg
Editor/Contributor(s)
T Cham, J Cai, C Dorai, D Rajan, T Chua
Title of proceedings
MMM'07 : Advances in multimedia modeling : Proceedings of the 13th International Multimedia Modeling Conference