Spectral Analysis of Stressed Speech for Speech Recognition
The objective of this thesis is to analyze the stress information in the spectral features of stressed speech. The analysis focuses on the frequency domain, with specific emphasis on several representations of this structure: spectrum, subband, and cepstrum. The investigation of stress information includes recognition of speech under stressed conditions. This thesis addresses four problems in stressed speech recognition. The first problem deals with the development and evaluation of a stressed speech database. The database is validated by evaluating the stress class and the speech information present in its utterances. Both are evaluated perceptually as well as with automatic methods: stress classification for the stress information and speech recognition for the speech information. Under stressed conditions, spectral energy migrates from lower to higher frequencies. As reported in the literature, this migration affects the spectral tilt and the subband energy of the speech signal. Compared to the source, the formants are less affected by stress. This is revisited as part of the second problem. The conventional method for computing spectral tilt captures only the gross spectral energy information of the speech signal. In the present work, relative formant peak displacement (RFD) is proposed to quantify this variation in formant peaks. The RFD values of the second, third, and fourth formant peaks are computed as relative displacements of these peaks from the first formant peak. A stress classifier is developed to investigate the stress information in the RFD feature.
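The abstract does not give the exact RFD formula, so the following is only an illustrative sketch of the idea of measuring the second, third, and fourth formant peaks relative to the first. It assumes that "relative displacement" means the difference from the first peak's value, normalized by that value, and that each peak is summarized by a single number (for example its frequency in Hz or its magnitude). The function name `relative_formant_displacement` and the normalization are hypothetical and may differ from the definition used in the thesis.

```python
from typing import List, Sequence


def relative_formant_displacement(peaks: Sequence[float]) -> List[float]:
    """Return assumed RFD values for formant peaks 2, 3 and 4.

    `peaks` holds one value per formant peak, ordered F1, F2, F3, F4
    (e.g. peak frequency in Hz or peak magnitude). The normalization
    by the first peak is an assumption, not the thesis's formula.
    """
    if len(peaks) < 4:
        raise ValueError("need values for the first four formant peaks")
    first = peaks[0]
    if first == 0:
        raise ValueError("first formant peak value must be non-zero")
    # Displacement of each higher peak from the first, relative to the first.
    return [(p - first) / abs(first) for p in peaks[1:4]]


if __name__ == "__main__":
    # Hypothetical peak frequencies in Hz, chosen only for illustration.
    example_peaks = [500.0, 1500.0, 2500.0, 3500.0]
    print(relative_formant_displacement(example_peaks))
```

A three-element vector of this kind could then serve as the input feature for a stress classifier, which is the role the abstract describes for the RFD feature.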
Supervisor: S. Dandapat and S.R. Mahadeva Prasanna
ELECTRONICS AND ELECTRICAL ENGINEERING