Structural processing methods for speech signal analysis

Bhagath, Parabattina

dc.contributor.author	Bhagath, Parabattina
dc.date.accessioned	2021-10-22T06:18:49Z
dc.date.accessioned	2023-10-20T04:37:06Z
dc.date.available	2021-10-22T06:18:49Z
dc.date.available	2023-10-20T04:37:06Z
dc.date.issued	2020
dc.description	Das, Pradip K	en_US
dc.description.abstract	Speech signal analysis is a crucial area of study that supports the development of methods for problems such as phoneme segmentation, speech recognition, and speaker verification. Various frameworks and techniques address these problems, with Hidden Markov Models (HMMs) and deep learning among the most popular. These frameworks are effective with large data sets where intensive training is possible. However, they become challenging to apply to under-resourced languages, for which sufficient training data is not available. To address the needs of these languages, methods are required that can extract significant cues from small amounts of data. Structural processing methods approach signals differently from conventional signal processing methods: a signal is treated as an image rather than as a time series of samples at successive time stamps. The need for these methods arises from the limitations of HMMs. An HMM consists of states, each of which depends on at most two neighboring states, which prevents an HMM from taking a holistic view of the entire signal. Recent developments in graph signal processing provide a way to analyze signals using graph data structures. These methods make it possible to combine temporal relations and frequency components when modeling signals. The thesis addresses the problems of speech characterization and segmentation while taking the above issues into account. Features such as trajectories and tree structures are proposed and found to be useful for modeling speech signals for subsequent use in recognition. Three different features, based on trajectories, graph structures, and fractals, are proposed for the segmentation task. The experiments were conducted on Indian-accented spoken English vowels and words and on TIMIT sentence data.
Tree structures and trajectories were found to be useful in characterizing vowels and words, respectively. For the phoneme segmentation experiments, word data were collected from speakers from different regions of India. The segmentation approaches were found to be appropriate for locating phoneme boundaries in spoken words and sentences. The algorithms and the results obtained are discussed in the thesis.	en_US
dc.identifier.other	ROLL NO.146101017
dc.identifier.uri	https://gyan.iitg.ac.in/handle/123456789/1949
dc.language.iso	en	en_US
dc.relation.ispartofseries	TH-2501;
dc.subject	COMPUTER SCIENCE AND ENGINEERING	en_US
dc.title	Structural processing methods for speech signal analysis	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Abstract-TH-2501_146101017.pdf
Size: 217.71 KB
Format: Adobe Portable Document Format
Description: ABSTRACT

Name: TH-2501_146101017.pdf
Size: 38.93 MB
Format: Adobe Portable Document Format
Description: THESIS

License bundle

Name: license.txt
Size: 1.71 KB
Format: Plain Text
Description:

Collections

PhD Theses (Computer Science and Engineering)