Glottal activity region based processing for speech synthesis

Adiga, Nagaraj

Files

TH-1840_11610235.pdf (4.84 MB)

Abstract-TH-1840_11610235.pdf (110.43 KB)

Synopsis-TH-1840_11610235.pdf (614.04 KB)

Date

2017

Authors

Adiga, Nagaraj

Abstract

Statistical parametric speech synthesis (SPSS) is generally preferred over concatenative synthesis because of its small footprint and flexibility. However, the naturalness and intelligibility of SPSS still lag behind those of concatenative synthesis. In this thesis, glottal activity region based processing for speech synthesis is proposed to improve the quality of the synthesized speech. Glottal activity regions are perceptually important and constitute the majority of speech sounds. The major contributions of the thesis are: (i) glottal activity region detection using features such as strength of excitation, normalized autocorrelation peak strength, and higher-order statistics; (ii) computation of a smoothed vocal-tract spectral envelope by applying the Riesz transform in the 2-D domain; (iii) a source model that represents the aperiodic and phase components using the integrated LP residual; and (iv) joint modeling of suprasegmental, system, and source features in SPSS to improve its prosody, naturalness, and intelligibility.

Description

Prasanna, S R Mahadeva

Keywords

ELECTRONICS AND ELECTRICAL ENGINEERING

URI

https://gyan.iitg.ac.in/handle/123456789/1038

Collections

PhD Theses (Electronics and Electrical Engineering)
