Epoch based dynamic prosody modification for neutral to expressive conversion
The objective of this thesis is to address the issues in the analysis, estimation and incorporation of prosodic parameters for neutral to expressive speech conversion. Prosodic parameters such as instantaneous pitch, duration and strength of excitation are used as the expression dependent parameters. For expressive speech analysis, refinements to the conventional methods are proposed to accurately estimate the prosodic parameters from different expressions. The variations in the prosodic parameters for different expressions are compared with respect to the neutral expression. Expressive speech is synthesized by modifying the prosodic parameters of the neutral speech according to the variations observed in the target expression. These variations are incorporated by epoch based prosody modification. Epochs represent the instants of glottal closure in voiced speech and the onset of burst or frication in unvoiced speech. Accurate estimation of the epoch locations improves the perceptual quality of the prosody modified speech. A computationally efficient and perceptually improved epoch based prosody modification method is first proposed for incorporating static prosodic variations for different expressions. As the prosodic parameters of the expressions vary dynamically with respect to the corresponding neutral speech, an epoch based dynamic prosody modification method is then proposed for incorporating dynamic variations in the prosodic parameters. Finally, the significance of dynamic prosody modification is demonstrated and evaluated for neutral to expressive speech conversion in three scenarios: text dependent and speaker dependent; text dependent and speaker independent; and text independent and speaker independent.
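As a rough illustration of what a dynamic (time-varying) pitch change means at the level of epochs, the sketch below derives a new epoch sequence from the intervals between the original epochs. It is not the method proposed in the thesis; it only assumes that epoch locations are already available as sample indices and that one pitch scale factor is given per epoch interval. The function name `modify_epochs` and its parameters are hypothetical, and the waveform synthesis step (placing speech segments around the new epochs) is not shown.

```python
import numpy as np

def modify_epochs(epochs, pitch_factors):
    """Return new epoch locations under a time-varying pitch scaling.

    epochs: increasing sample indices of the epochs in the neutral speech.
    pitch_factors: one scale factor per epoch interval (len(epochs) - 1);
        a factor > 1 raises the pitch by shortening the local interval.
    Both names are illustrative assumptions, not taken from the thesis.
    """
    epochs = np.asarray(epochs, dtype=float)
    factors = np.asarray(pitch_factors, dtype=float)
    if epochs.size < 2 or factors.size != epochs.size - 1:
        raise ValueError("need one pitch factor per epoch interval")

    # Desired local pitch period for each original epoch interval.
    target = np.diff(epochs) / factors

    new_epochs = [epochs[0]]
    while True:
        # Find the original interval that contains the current position.
        idx = np.searchsorted(epochs, new_epochs[-1], side="right") - 1
        idx = min(idx, target.size - 1)
        nxt = new_epochs[-1] + target[idx]
        if nxt > epochs[-1]:
            break  # keep the overall duration unchanged
        new_epochs.append(nxt)
    return np.array(new_epochs)
```

For example, with epochs at samples 0, 100, 200, 300 and 400 and factors `[1, 1, 2, 2]`, the first half keeps its epochs while the second half receives epochs every 50 samples (0, 100, 200, 250, 300, 350, 400), so the pitch doubles only where the factor says so. A static modification corresponds to using the same factor for every interval.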
Supervisor: S.R.M. Prasanna
ELECTRONICS AND ELECTRICAL ENGINEERING