Procedure for audio-assisted browsing of news video using generalized sound recognition
10 January 2003
Ajay Divakaran, Regunathan Radhakrishnan, Ziyou Xiong, Michael Casey
Proceedings Volume 5021, Storage and Retrieval for Media Databases 2003; (2003) https://doi.org/10.1117/12.476294
Event: Electronic Imaging 2003, 2003, Santa Clara, CA, United States
Abstract
Casey describes a generalized sound recognition framework based on reduced-rank spectra and Minimum-Entropy Priors. This approach enables successful recognition of a wide variety of sounds, such as male speech, female speech, music, and animal sounds. In this work, we apply this recognition framework to news video to enable quick video browsing. We first identify speaker change positions in the broadcast news using the sound recognition framework. We then combine the speaker change positions with color and motion cues from the video to locate the beginning of each topic covered by the news video. We can thus skim the video by playing only a small portion starting from each location where one of the principal cast begins to speak. In combination with our motion-based video browsing approach, our technique provides simple, automatic news video browsing. While similar work has been done before, our approach is simpler and faster than competing techniques, and it provides a rich framework for further analysis and description of content.
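As a rough illustration of the skimming idea described in the abstract, the following Python sketch takes a sequence of per-frame sound-class labels (assumed to come from some sound recognizer), finds positions where the label changes and stays changed, and turns each position into a short playback segment. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `min_run` smoothing rule, the label strings, and the fixed clip length are all hypothetical, and the combination with color and motion cues is not modeled.

```python
from typing import List, Tuple


def speaker_change_positions(labels: List[str], min_run: int = 3) -> List[int]:
    """Return frame indices where the label changes and the new label
    persists for at least `min_run` consecutive frames.

    The persistence check is an assumed smoothing rule that ignores
    isolated misclassified frames; it is not taken from the paper.
    """
    if not labels:
        return []
    changes = []
    current = labels[0]
    for i in range(1, len(labels)):
        if labels[i] != current:
            run = labels[i:i + min_run]
            # Accept the change only if the new label is stable.
            if len(run) == min_run and all(l == labels[i] for l in run):
                changes.append(i)
                current = labels[i]
    return changes


def skim_segments(
    change_frames: List[int],
    frame_seconds: float,
    clip_seconds: float,
    total_seconds: float,
) -> List[Tuple[float, float]]:
    """Map each change position to a (start, end) playback window in seconds,
    clipped to the end of the program."""
    segments = []
    for frame in change_frames:
        start = frame * frame_seconds
        end = min(start + clip_seconds, total_seconds)
        segments.append((start, end))
    return segments


if __name__ == "__main__":
    # Hypothetical labels, one per second of audio.
    labels = (
        ["male_speech"] * 5
        + ["female_speech"]          # isolated frame, treated as noise
        + ["male_speech"] * 4
        + ["female_speech"] * 5
        + ["music"] * 3
    )
    changes = speaker_change_positions(labels, min_run=3)
    print(changes)
    print(skim_segments(changes, frame_seconds=1.0, clip_seconds=4.0,
                        total_seconds=float(len(labels))))
```

In a fuller pipeline, the change positions would presumably be filtered or confirmed using the video cues before the skim segments are generated; this sketch leaves that step out.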
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ajay Divakaran, Regunathan Radhakrishnan, Ziyou Xiong, and Michael Casey "Procedure for audio-assisted browsing of news video using generalized sound recognition", Proc. SPIE 5021, Storage and Retrieval for Media Databases 2003, (10 January 2003); https://doi.org/10.1117/12.476294
CITATIONS
Cited by 6 scholarly publications.
KEYWORDS
Video
Feature extraction
Optical tracking
Semantic video
Video compression
Computing systems
Databases