Improving quality of statistical parametric speech synthesis using sonority information

Sharma, Bidisha

Files

Abstract-TH-1917_136102017.pdf (65.07 KB)

TH-1917_136102017.pdf (5.16 MB)

Date

2018

Authors

Sharma, Bidisha

Abstract

This thesis aims to improve the naturalness and intelligibility of synthesized speech obtained from statistical parametric speech synthesis (SPSS). Along with the conventional source and spectral information, additional significant features can be derived from the speech signal to preserve its characteristics in parametric form. Sonority information represents the spectral prominence, higher energy, and periodicity aspects of speech, which are related to human speech perception and which change with varying vocal-tract constriction and glottal source amplitude during speech production. This information is therefore extracted from the speech signal in the form of a sonority feature, which can delineate the degree of sonority associated with a sound unit. The sonority feature is incorporated into the SPSS framework for use in the studies reported in this thesis.

Post-filtering mechanisms are found to be effective in alleviating the over-smoothing effect in parameter sequences generated by SPSS. Because the characteristics of the speech parameters may vary extensively across broad categories of sound units, a class-based dynamic post-filtering method is proposed. The excitation source parameters (fundamental frequency and strength of excitation (SoE)) and the spectral parameters (sharpness of the peaks and valleys of the spectrum) of each frame are enhanced using post-filtering factors that change with the sonorant sound category. The sonorant class information is derived from a support vector machine based classifier trained on the sonority feature associated with each frame. This method improves the temporal variation and fine spectral structure and reduces the deviation from the natural counterpart, leading to improved synthesized speech quality.

Description

Prasanna, S R Mahadeva

Keywords

ELECTRONICS AND ELECTRICAL ENGINEERING

URI

https://gyan.iitg.ac.in/handle/123456789/1120

Collections

PhD Theses (Electronics and Electrical Engineering)
