Speaker verification using sufficient train and limited test data
This thesis studies speaker verification (SV) from the perspective of application-oriented systems and identifies sufficient training data with limited test data as the most favorable framework. Three directions with scope for improving system performance in the limited test data scenario are highlighted. Each direction is investigated in detail, and a combined system that incorporates these explorations is proposed. The source features provide information about the glottal excitation, such as the pitch period, the strength of excitation, and the shape of the glottal signal. Since the glottis and the associated muscle structure are unique to each individual, the information represented by the source features is expected to be distinct for each speaker and can be utilized for SV. Three source features are explored, namely mel power difference of spectrum in subbands (MPDSS), residual mel frequency cepstral coefficient (RMFCC), and discrete cosine transform of integrated linear prediction residual (DCTILPR), and their significance for limited test data cases is demonstrated. The source features are found to capture different attributes of the source information, and their fusion provides performance comparable to that of the conventional mel frequency cepstral coefficient (MFCC) based vocal tract features.
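As their names suggest, RMFCC and DCTILPR are derived from a linear prediction (LP) residual, the signal that remains after the vocal tract contribution predicted by an LP model is removed from the speech frame. The following is a minimal sketch of that residual computation only, not the implementation used in the thesis. The autocorrelation method, the LP order of 10, and the function names `lp_coefficients` and `lp_residual` are assumptions made for illustration.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Autocorrelation-method LP coefficients a[1..order] (assumed setup),
    such that frame[n] is approximated by sum_k a[k] * frame[n - k]."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    # Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1 : order + 1])

def lp_residual(frame, order=10):
    """Residual = frame minus its LP prediction (inverse filtering)."""
    frame = np.asarray(frame, dtype=float)
    a = lp_coefficients(frame, order)
    pred = np.zeros_like(frame)
    for k in range(1, order + 1):
        # Contribution of the sample k steps in the past.
        pred[k:] += a[k - 1] * frame[:-k]
    return frame - pred
```

For a frame with strong short-term correlation, the residual carries far less energy than the frame itself; what remains mainly reflects the excitation, which is the information the source features aim to capture.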
Supervisor: S. R. Mahadeva Prasanna
ELECTRONICS AND ELECTRICAL ENGINEERING