Exploration of sparse representation techniques in language recognition

The work presented in this thesis investigates different approaches for achieving efficient sparse representation for the language recognition (LR) task. In this context, both simple and discriminatively learned dictionaries are explored, employing different kinds of regularization. The presented approaches are contrasted with the state-of-the-art LR approach based on the i-vector representation of a speech utterance together with LDA/WCCN session/channel compensation. To address the high computational complexity and memory requirement associated with the i-vector approach, a low-complexity ensemble of random subspaces of GMM-mean supervectors has been developed. To further enhance LR performance, the random-subspace ensemble approach has been extended to joint factor analysis (JFA) compensated GMM-mean supervectors. The overall complexity and run-time of the combined system are found to be lower than those of the i-vector based LR system. Sparse representation techniques applied to JFA latent vectors have also been explored. To achieve a diversity gain, an ensemble of learned-exemplar (discriminative) dictionaries has been proposed; this approach is noted for improved LR with further reduced latency. Finally, i-vector representations derived from deep neural network based bottleneck features have also been explored in the sparse coding based LR paradigm.
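As a rough illustration of the sparse coding paradigm the abstract refers to, the sketch below shows sparse-representation classification over a dictionary of class-labelled exemplars: an l1-regularized reconstruction is solved with ISTA, and the class is chosen by the lowest class-wise reconstruction residual. This is a minimal generic sketch, not the thesis's actual system; the function names, toy two-language data, and parameter values (`lam`, iteration count) are all assumptions for the example.

```python
import numpy as np

def ista(D, x, lam=0.1, n_iter=200):
    """Solve min_a 0.5*||x - D a||^2 + lam*||a||_1 by iterative
    shrinkage-thresholding (ISTA) with step size 1/L."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - x) / L      # gradient step
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

def src_classify(D, labels, x, lam=0.1):
    """Sparse-representation classification: keep only the coefficients
    belonging to one class and pick the class with the smallest residual."""
    a = ista(D, x, lam)
    best, best_res = None, np.inf
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        res = np.linalg.norm(x - D @ np.where(mask, a, 0.0))
        if res < best_res:
            best, best_res = c, res
    return best

# Toy data: two "languages" A and B, each represented by noisy copies of a
# class centre (standing in for exemplar supervectors / i-vectors).
rng = np.random.default_rng(0)
dim, per_class = 20, 5
centers = {"A": rng.normal(size=dim), "B": rng.normal(size=dim)}
atoms, labels = [], []
for lang, c in centers.items():
    for _ in range(per_class):
        v = c + 0.1 * rng.normal(size=dim)
        atoms.append(v / np.linalg.norm(v))  # unit-norm dictionary atoms
        labels.append(lang)
D = np.column_stack(atoms)

# A test utterance drawn near the class-A centre should be labelled "A".
x = centers["A"] + 0.1 * rng.normal(size=dim)
x = x / np.linalg.norm(x)
pred = src_classify(D, labels, x)
```

The residual-based decision rule is what makes the dictionary "discriminative" in use: atoms of the wrong class contribute little to reconstructing the test vector, so its residual under that class stays large.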
Supervisor: Rohit Sinha