In this paper, we study film soundtracks and their indexical semiotic usage by developing a classification system that detects complex sound scenes and their constituent sound events in cinema. We investigate two main issues: determining what constitutes the presence of a high-level sound scene, and what inferences about the thematic content of the scene can be drawn from this presence; and classifying environmental sounds in the audio track of the scene to assist in the automatic detection of the high-level scene. Experiments with our classification system on pure sounds resulted in a correct event classification rate of 88.9%. When the audio content of a number of film scenes was examined, sound event detection was less accurate because of the presence of mixed sounds, but the film audio samples were generally classified with the correct high-level sound scene label, enabling correct inferences about the story content of the scenes.
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
Publication classification: E1.1 Full written paper - refereed
Copyright notice: 2001, IEEE
Title of proceedings: ICME 2001 : Proceedings of the IEEE International Conference on Multimedia and Expo