Analysis of Speech and Music Content for Movie Genre Classification
Date
2023
Abstract
Movies are a popular mode of entertainment around the world. The consistent rise in the production and consumption of movies demands more efficient automatic movie content analysis applications. Movie Genre Classification (MGC) is vital for underage censorship, search, retrieval, and targeted publicity. Current trends in MGC literature indicate a focus on short trailers instead of full movies and a multimodal approach. The audio modality is generally used only as an auxiliary channel. However, due to its rich genre-specific information, the audio signal deserves a dedicated study in the current context. Hence, this thesis aims to perform only audio-specific MGC. The thesis has four principal contributions. First, spectral peak tracking-based magnitude spectrum features are proposed for isolated speech and music classification. Second, the underexplored phase component of the audio signals is utilized for discriminating speech and music. The third contribution involves using harmonic-percussive source-separated features and classifiers in the multi-task learning framework for identifying speech overlapped with music. Finally, the above proposals are employed for the MGC task. The spectral peak tracking-based method performs better than the other proposals and the baselines. Specific combinations of all the proposed and baseline features provide the overall best performance, even in the cross-dataset scenario. The thesis work can be extended in the future by analyzing the individual constituents of speech and music for a more nuanced representation of movie genres.
Description
Supervisors: Guha, Prithwijit and Prasanna, S R Mahadeva
Keywords
Movie Genre Classification, Speech Music Classification, Audio Signal Processing, Magnitude Spectrum Features, Phase Spectrum Features, Harmonic Percussive Source Separation, Multi-task Learning, Time and Feature Attention