Video genre categorization using audio wavelet coefficients

Dinh, Phung Quoc, Dorai, Chitra and Venkatesh, Svetha 2002, Video genre categorization using audio wavelet coefficients, in ACCV 2002 : Proceedings of the 5th Asian Conference on Computer Vision, Asian Federation of Computer Vision Societies, [Tokyo, Japan], pp. 69-74.

Title Video genre categorization using audio wavelet coefficients
Author(s) Dinh, Phung Quoc (ORCID iD: orcid.org/0000-0002-9977-8247)
Dorai, Chitra
Venkatesh, Svetha
Conference name Asian Conference on Computer Vision (5th : 2002 : Melbourne, Vic.)
Conference location Melbourne, Vic.
Conference dates 22-25 Jan. 2002
Title of proceedings ACCV 2002 : Proceedings of the 5th Asian Conference on Computer Vision
Editor(s) Suter, D.
Bab-Hadiashar, A.
Publication date 2002
Conference series Asian Conference on Computer Vision
Start page 69
End page 74
Total pages 6
Publisher Asian Federation of Computer Vision Societies
Place of publication [Tokyo, Japan]
Keyword(s) wavelet
wavelet-based audio features
fourier
audio signal
automatic program genre detection
Summary In this paper, we investigate wavelet transform-based analysis of the audio tracks accompanying videos for automatic program genre detection. We compare classification performance using wavelet-based audio features with that using conventional features derived from Fourier and time analysis, for the task of discriminating TV programs such as news, commercials, music shows, concerts, motor racing games, and animated cartoons. Three classifiers, namely Decision Trees, SVMs, and k-Nearest Neighbours, are studied to analyse how reliably our wavelet feature-based approach performs. Further, we investigate the appropriate duration of an audio clip to be analysed for this automatic genre determination. Our experimental results show that features derived from the wavelet transform of the audio signal separate the six video genres studied very well. We also find no significant difference in performance across the classifiers as the audio clip duration varies.
Notes Papers will be published in Springer's Lecture Notes in Computer Science.
ISBN 0958025606
9780958025607
Language eng
Field of Research 089999 Information and Computing Sciences not elsewhere classified
Socio Economic Objective 970108 Expanding Knowledge in the Information and Computing Sciences
HERDC Research category E1.1 Full written paper - refereed
Copyright notice ©2002, Springer
Persistent URL http://hdl.handle.net/10536/DRO/DU:30044836

Document type: Conference Paper
Collection: School of Information Technology
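
Illustrative sketch (not from the paper): the Summary above describes deriving features from the wavelet transform of an audio signal and classifying them with, among others, k-Nearest Neighbours. The record does not state which wavelet or which features the authors used, so the sketch below assumes a Haar wavelet and mean sub-band energies as features. The function names `haar_step`, `subband_energies`, `knn_predict`, and `tone`, and the synthetic "low"/"high" tone classes, are hypothetical stand-ins chosen only to show the general shape of such a pipeline.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail).
    A trailing unpaired sample, if any, is dropped."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def subband_energies(signal, levels=3):
    """Assumed feature vector: mean energy of the detail coefficients at
    each level, followed by the mean energy of the final approximation.
    Requires len(signal) >= 2 ** levels."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail) / len(detail))
    features.append(sum(a * a for a in approx) / len(approx))
    return features

def knn_predict(train, query, k=3):
    """train is a list of (feature_vector, label) pairs. Returns the
    majority label among the k nearest neighbours (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def tone(freq_hz, n=256, rate_hz=8000):
    """Synthetic stand-in for an audio clip: a pure sine tone."""
    return [math.sin(2.0 * math.pi * freq_hz * t / rate_hz) for t in range(n)]

# Toy training set: synthetic tones, not real TV audio.
train = [(subband_energies(tone(f)), "low") for f in (100, 120, 150)]
train += [(subband_energies(tone(f)), "high") for f in (3000, 3200, 3500)]

low_guess = knn_predict(train, subband_energies(tone(130)))
high_guess = knn_predict(train, subband_energies(tone(3300)))
```

In this toy setting the two tone classes separate because low-frequency energy concentrates in the final approximation band while high-frequency energy lands in the first detail band. Real genre audio would need longer clips, more decomposition levels, and whatever features the paper itself specifies.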
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Citation counts: TR Web of Science: cited 0 times
Scopus: cited 0 times
Access Statistics: 252 Abstract Views, 2 File Downloads
Created: Tue, 01 May 2012, 11:01:25 EST

Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact drosupport@deakin.edu.au.