PhD Theses (Electronics and Electrical Engineering)
Recent Submissions
Exploration of Novel approaches for offline writer identification using handwritten words (2024) Kumar, Vineet
This thesis presents innovative approaches for offline writer identification from handwritten word images, leveraging various deep learning techniques. The first work employs feature maps from pre-trained CNN layers to capture writer-specific characteristics. Key-point regions are first detected using the SIFT algorithm across different abstractions, such as characters and their combinations. These regions are processed through a CNN, producing feature maps that are then represented using a modified HOG feature descriptor. A unique contribution lies in extracting additional cues from these feature maps through a saliency measure derived using Sparse Principal Component Analysis (SPCA). The saliency scores are integrated with HOG features to create customized descriptors, which are then classified using SVMs to determine the identity of the writer.

Design of Cryptographic Primitives for Wireless Communication and Blockchain Mining (2024) Goswami, Sushree Sila P
The rising reliance on the internet across various sectors has heightened the importance of security measures, given the potential threat posed by cyber attackers who could corrupt or misuse data. This thesis explores the implementation of diverse cryptographic algorithms (DES, RSA, AES, ECC, and ECCDH) on a Field Programmable Gate Array (FPGA). In secure wireless communications, stream ciphers are preferred for the simplicity of their hardware implementation. The design of stream ciphers generally involves using a pseudorandom number generator to produce a keystream, which masks the plaintext through an XOR operation, resulting in ciphertext. This research presents the realization of these designs in the Verilog Hardware Description Language and their implementation on FPGA.
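The keystream masking described above can be illustrated with a minimal Python sketch. It is not the thesis design: the thesis targets Verilog on FPGA, and the generator here is Python's general-purpose `random.Random`, which is not cryptographically secure and only stands in for a real keystream generator such as SNOW. The function names `keystream` and `xor_mask` and the seed value are hypothetical, chosen for illustration.

```python
import random


def keystream(seed: int, length: int) -> bytes:
    """Toy keystream generator (assumption: stand-in only, NOT secure)."""
    rng = random.Random(seed)  # deterministic for a given seed
    return bytes(rng.randrange(256) for _ in range(length))


def xor_mask(data: bytes, seed: int) -> bytes:
    """XOR every data byte with the matching keystream byte."""
    ks = keystream(seed, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))


# Encryption and decryption are the same operation,
# because (p XOR k) XOR k == p for every byte.
plaintext = b"hello LTE"
ciphertext = xor_mask(plaintext, seed=2024)
recovered = xor_mask(ciphertext, seed=2024)
```

Because XOR is its own inverse, the receiver only needs the same seed (the shared key in a real cipher) to regenerate the keystream and unmask the data.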
Experimental results indicate that a modified SNOW 2.0 architecture is 13% more resource-efficient and 19% more efficient overall compared to the traditional SNOW 2.0, and 104% more efficient than existing architectures. Security is paramount in electronic communication, particularly in wireless networks like LTE, where cryptographic algorithms are vital for protecting sensitive data. While software implementations are straightforward, they often lack the speed required for real-time communication devices, necessitating hardware implementations of cryptographic processors. This thesis introduces a novel SNOW3G crypto processor for 4G LTE security, optimized for area, power, and efficiency. Implemented on the Zynq ZC702 FPGA, this design uses only 0.31% of available area and achieves significant efficiency and low power consumption, making it suitable for mobile devices.

Design and Implementation of Continuous Flow FFT Processors for OFDM in Wireless GigaHertz Standards (2023) Agarwal, Sumit
The throughput requirement of the latest OFDM-based IEEE 802.11ay WLAN standard is between 20 and 40 Gbps. In addition, the FFT processor needed for OFDM must work in continuous mode for real-time communication. The number of FFT points can be variable. In this thesis, we propose a continuous flow architecture for a 512-point FFT that meets this requirement at about 28 Gbps. The number of points is kept fixed here to illustrate the essential features of the design; it can be varied, if desired, with minimal architectural modifications. Architectures that meet the throughput requirement (10 Gbps) of the earlier WLAN standard IEEE 802.11ad have been reported in the literature. The proposed architecture achieves more than double this throughput, at 28 Gbps, with chip area and clock similar to those of the best existing 10 Gbps designs. This is made possible by a design specialized for OFDM, unlike earlier FFT chips, which were designed for general-purpose FFT.
The proposed architecture uses two radix-16 stages and one radix-2 stage to meet the high throughput requirement. Standard continuous flow (CF) FFT designs use two memories. The proposed design exploits the smaller word length of 4 bits (for 64 QAM) of OFDM to introduce an additional, smaller input memory and a simpler processing element (PE) for the input stage. Combined with the existing two memories, there are now three memories for the three-stage FFT. This design thus allows the memories to assume dedicated roles for each stage. Compared with the existing practice of switching memories, dedicated memories need a novel addressing scheme to maintain CF, because data is replaced in the same memory rather than by switching memories.

Design of Low Power VLSI Architectures for Machine Learning Based Wearable Healthcare Devices (2023) Janveja, Meenali
According to the World Health Organization (WHO), cardiovascular diseases cause approximately 17.9 million fatalities yearly, estimated at 31% of global mortality. An electrocardiogram (ECG) is a biosignal that provides information on the electrical activity of the patient’s heart. ECG enables the diagnosis of various cardiac abnormalities, from acute coronary syndrome to cardiac arrhythmias. Therefore, ECG monitoring in daily life is necessary for early diagnosis of heart disease. Hardware and software developments have led to machine learning enabled wearable healthcare devices, such as smartwatches and chest patches, which can easily and continuously monitor cardiac functioning. These wearable devices provide critical alerts for events that require prompt medical attention or hospitalization, making them highly efficient and practical. A conventional wearable device has three primary modules. The first module is the sensors and analog front end, responsible for acquiring the ECG signals and converting them to digital samples.
The second module consists of an ECG co-processor, incorporating a feature extraction block and a machine learning-based classifier responsible for ECG signal analysis and classification of cardiovascular diseases. The final module comprises data compression and transmitter blocks, which transmit ECG data and the classifier output to the cloud servers. In wearable devices, battery life is critical because most devices monitor ECG continuously. Further, these devices should be small and easy to use. Therefore, area- and power-optimized algorithms and their VLSI architectures are required for continuous monitoring of ECG on wearable devices. Thus, we present optimized ECG signal processing algorithms and their low-power and resource-efficient VLSI architectures for detecting cardiovascular diseases, such as cardiac arrhythmia and myocardial infarction, on wearable devices.

Automated Diagnosis of Heart Valve Diseases from Phonocardiogram Signals using Deep Learning (2023) Das, Samarjeet
Heart valve diseases (HVDs) are the primary causes of mortality in developing and underdeveloped countries. Early detection of HVDs is essential to avoid lethal heart diseases due to the disease’s progression. The phonocardiogram (PCG) signal provides a non-invasive and cost-effective tool that helps with the preliminary diagnosis of HVDs. However, raw PCG signals are often susceptible to noise and artifacts, which degrade the signal quality and make it challenging to diagnose HVDs manually. Furthermore, the wide variability in PCG morphologies due to HVDs makes manual examination often subjective and prone to human error. To address the above challenges, this dissertation focuses on developing automated deep-learning methods for diagnosing HVDs.

Development of Kalman Filter based Algorithms for Fringe Pattern Analysis (2024) Sharma, Shikha
The purpose of fringe pattern analysis is to retrieve the phase from the fringe pattern.
Retrieving the phase from the fringe pattern is essential for deriving the object information. Therefore, demand for the phase information has promoted the development of fringe analysis techniques. Spatial fringe analysis techniques typically involve different operations such as fringe denoising, fringe normalization, and fringe pattern demodulation for the phase estimation. In some cases, phase aberration compensation is also required. The thesis presents a number of spatial fringe processing algorithms based on the application of the Kalman filter.

Analysis of Speech and Music Content for Movie Genre Classification (2023) Bhattacherjee, Mrinmoy
Movies are a popular mode of entertainment around the world. The consistent rise in the production and consumption of movies demands more efficient automatic movie content analysis applications. Movie Genre Classification (MGC) is vital for underage censorship, search, retrieval, and targeted publicity. Current trends in MGC literature indicate a focus on short trailers instead of full movies and a multimodal approach. The audio modality is generally used only as an auxiliary channel. However, due to its rich genre-specific information, the audio signal deserves a dedicated study in the current context. Hence, this thesis aims to perform only audio-specific MGC. The thesis has four principal contributions. First, spectral peak tracking-based magnitude spectrum features are proposed for isolated speech and music classification. Second, the underexplored phase component of the audio signals is utilized for discriminating speech and music. The third contribution involves using harmonic-percussive source-separated features and classifiers in the multi-task learning framework for identifying speech overlapped with music. Finally, the above proposals are employed for the MGC task. The spectral peak tracking-based method performs better than the other proposals and the baselines.
Specific combinations of all the proposed and baseline features provide the overall best performance, even in the cross-dataset scenario. The thesis work can be extended in the future by analyzing the individual constituents of speech and music for a more nuanced representation of movie genres.

Automatic Dialect Identification in Ao, a Low Resource Language (2023) Tzudir, Moakala
Dialect Identification (DID) is a significant research problem widely explored in major languages like Arabic, Chinese, and Spanish. DID can serve as a frontend for many applications like Automatic Speech Recognition (ASR) that may require special dialect-specific enhancements for improved performance. This thesis proposes an automatic DID system for Ao, an under-resourced language of India. Ao is a Tibeto-Burman language spoken in Nagaland. It is a tonal language with three lexical tones: high, mid, and low. Chungli, Mongsen, and Changki are the three dialects of Ao that differ in their respective tone assignment on lexical words. Four principal contributions are made in this thesis. The first contribution of this thesis is creating a manually collected and annotated novel speech dataset to foster research on the Ao language. The second contribution of the thesis is a detailed acoustic study of the unexplored tone dynamics of the dialects of Ao. Based on the analysis, a tonal feature ($F_0$) to capture the dialect-specific tone information is proposed. The DID performance improves when the proposed tonal feature is combined with other spectral features. As the third contribution, this thesis explores three excitation source features in the DID task. The source features studied are Residual Mel Frequency Cepstral Coefficient (RMFCC), Integrated Linear Prediction Residual Log Mel Spectrogram (ILPR-LMS), and Linear Prediction (LP)-gammatonegram. A notable performance improvement is observed when the source information is combined with the vocal tract information.
The fourth contribution of this thesis is the exploration of prosody-related characteristics of speech signals. The prosodic features are observed to provide significant performance improvements in classifying the dialects of Ao. The thesis work is concluded by combining all the proposed approaches to build an efficient DID system for Ao. Among many hurdles in studying under-resourced languages like Ao, the need for more data is the most prominent. Nevertheless, the contributions of this thesis may bridge some of those gaps and spur future research in this direction.

Story Segmentation and Retrieval of News Videos in a Multi-modal Framework (2024) Haloi, Pranabjyoti
Shot segmentation, categorization, indexing, and news story formation are the most important and primary steps in building an efficient and well-sorted video storage and retrieval system. News channels have evolved into one of the primary sources of information. However, with the recent increase in the number of news channels, a plethora of news content is available on air, and it has become difficult to store and retrieve news videos effectively. News videos also include commercials, which carry considerably less information. These commercials must be filtered out and the remaining news video segmented meaningfully. Segmentation of news videos is crucial for efficient storage and categorization of the videos. The segmented stories also make it easy to find and retrieve the desired news. In this work, we developed different algorithms for shot segmentation, categorization, indexing, and retrieval of news videos.
Our methods are independent of the different temporal and spatial structures of various news channels and require minimal manual input.

Graph based Classification Techniques for Pig Breed Identification from Hand-crafted Visual Muzzle Descriptors (2023) Chakraborty, Shoubhik
Breed classification of pigs based on muzzle images has been attempted in this thesis. Limited, noisy, heterogeneous visual data stemming from MUZZLE images taken from pigs belonging to different breeds pose many challenges, not just from the point of view of identifying and isolating those features and statistics which are discriminatory in nature, but also from the point of view of constructing a suitable breed-centric model (aided by an inferencing mechanism), which is robust and stable. The work in this light has three primary contributions: (1) designing and selecting a set of handcrafted colour- and texture-based visual descriptors which are breed-discriminatory; (2) devising a feature-specific siphoning policy and model for segregating breeds serially; and (3) using spanning trees in DUAL MODE (MIN-tree and MAX-tree forms) for binding breed-specific features and devising a NOVEL test-point INDUCTION procedure for producing an OUTLIER score that indicates whether the point is in the INTERIOR or EXTERIOR of the breed-cluster. Given the diversity of data on hand and the limited training set available to build the model, CROSS-testing results were very promising: DUROC-breed (93.85%), GHUNGROO (97.48%), HAMPSHIRE (94.27%) and YORKSHIRE (100%).

Acoustic Charge Transport in Organic Semiconductors using Surface Acoustic Wave Devices (2024) Mishra, Himakshi
A surface acoustic wave (SAW) is a periodic deformation of the surface of an elastic material propagating at the surface primarily as a linear wave front.
Although their existence had already been established by Lord Rayleigh in 1885, it wasn't until 1965, through the development of the interdigital transducer (IDT), that they were first utilised for various applications. It is now feasible to stimulate and detect SAWs on a piezoelectric surface in an effective manner. SAW devices have a very broad range of applications in several fields. Professional radar and communications systems extensively use SAW delay lines, band pass filters, resonators, oscillators, and matched filters. SAW devices can also be employed as pressure, humidity, and temperature sensors and for chemical sensing and analysis. SAWs have very low velocity and short wavelengths, which reduces device size and weight, and the devices can hence be mass manufactured. When a semiconductor interacts with a SAW, the acoustic deformations induced by the SAW have a significant impact on the semiconductor's energy bands and, consequently, its electrical characteristics. SAW-induced band edge modulation leads to the spatial separation of charge carriers of a semiconductor. Furthermore, the energy and momentum carried by the SAW are transmitted to charge carriers, resulting in a dragging force on them. This phenomenon is known as the acoustoelectric effect, and the transport caused by this effect is termed acoustic charge transport (ACT). The process of ACT has been demonstrated by several researchers in inorganic semiconductors either by injecting carriers through an input bias or by optically generating carriers. Organic semiconductors are increasingly being used as the active layer in a wide variety of innovative technologies due to their solution-processability, light weight, and flexibility. In contrast to inorganic semiconductors, organic materials form a polycrystalline layer, and their charge transport is mostly limited by grain boundaries.
Numerous studies over the past few years have investigated the factors affecting and contributing to the charge transport of organic semiconductors. However, the interaction of an acoustic wave with these materials has not been reported yet. The primary objective of the thesis is to observe the charge transport of ambipolar electrons and holes in organic semiconductor films by means of acoustic waves and to investigate potential acousto-optic applications that may result from this interaction.

Evaluation of Out-of-Breath Speech Using Machine Learning Approaches (2024) Sahoo, Sibasis
Stress alters the speech production mechanism. Factors like emotion, cognitive load, pathology, noisy condition (Lombard effect), physical load, sleep deprivation, etc., affect speech production. Among these, speech under emotional, noisy, and pathological conditions has been investigated extensively. Little light has been shed on speech under physical load conditions, called out-of-breath speech. Such evaluation of out-of-breath conditions can be used in context-aware speech interfaces to estimate the workload level, exercise intensity of an athlete, and physical fitness of a person.

Design of RRAM-Based Integrate and Fire Neuron And Programmable Synapse for Neuromorphic Computing (2024) Dongre, Ashvinikumar Pruthviraj
A human brain can perform compute-intensive tasks, such as multi-object recognition, reasoning, and decision-making, while consuming only 20 W of power. In contrast, to recognize 1000 different objects, a CPU consumes around 250 W of power. Around $10^{11}$ neurons in the human brain are interconnected through approximately $10^{15}$ synapses, which are responsible for the brain’s exceptional computing capacity. Advancements in processing technology have reduced the technology nodes drastically, which has further reduced the power consumption of processors; still, they cannot match the low power consumption of the human brain.
Even with the latest technological advancements, optimizing processors with von Neumann architectures for speed and power becomes challenging because of the memory bottleneck.

Segmentation-based Approaches to Pre-operative and Intra-operative Brain Ultrasound Image Registration (2024) Chel, Haradhan
The human brain is made of soft tissue that floats in cerebrospinal fluid (CSF), and it frequently shifts during a surgical process. This brain-shift prevents a neuro-navigation (NN) system from locating the diseased region. A brain ultrasound (BUS) imaging system is utilized to monitor the surgical procedure. Brain-shift can be corrected by registering the pre-operative brain ultrasound (pBUS) and the corresponding intra-operative brain ultrasound (iBUS) images. The similarity between the pBUS and the corresponding iBUS image is affected for a variety of reasons, which makes the registration difficult. This thesis developed three methods to extract similar regions in pBUS and iBUS images and register using these regions. The first method finds the common edge-rich regions in the registering image pair and then registers those edge-rich regions by minimizing the mean-squared registration error. The second method is a fast and fully automatic method for extracting the hyper-echoic (HE) regions from the registering image pairs. The patch-based approach makes the segmentation faster and robust to noise. The segmented HE regions are registered by minimizing the registration error. The third approach adopts a patch-based level-set strategy for segmenting three prominent HE regions, namely the longitudinal fissure, choroid plexus, and tumor, and two anechoic regions, namely the ventricles and the resection cavity. Registration is then performed on the segmented image sections. Various gradient-based and heuristic optimizations are used for minimizing the mean-squared registration error during registration.
Experiments were conducted on selected image pairs from the RESECT and the BITE datasets. For performance evaluation, the segmented ground truth images are prepared by annotating the boundaries of different regions in coordination with an expert radiologist. For comparing registration performance, common tag points are selected from the registering image pairs, and the improvement of mean target registration error (mTRE) after registration is analyzed. Experimental results demonstrate the superiority of the proposed segmentation-based approaches over the state-of-the-art methods.

Design and Analysis of Memelements for Low Power and Area Efficient High Frequency Applications (2023) Ananda, Y R
Memristor, memcapacitor, and meminductor are the three types of memory elements (memelements). The memristor is the fourth fundamental circuit element, based on the missing relationship between two electrical quantities, the charge (q) and the flux (φ). The memristor is considered one of the most promising nano-devices among those currently being studied for possible use in future electronic systems. Its best performance features include fast switching speed, high endurance and data retention, low power consumption, high integration density, and CMOS compatibility. Memristors are being explored as a potential technology to replace CMOS for logic-in-memory systems exploiting memristive nonvolatility. Nonvolatility is one of the prominent characteristic features of the memristor, and it effectively addresses the so-called memory wall problem in conventional von Neumann architecture. A memristive device is highly nonlinear and non-volatile, which makes it a better storage element with greater data density than existing memory devices.
In addition, the memristor exhibits switching capability, which is relevant for implementing logic gates, realizing Boolean functions, and designing systems such as arithmetic units like adders, subtractors, multipliers and dividers.

Shouted, Overlapped and Competitive Speech Detection in Indian Television News Debates (2022) Baghel, Shikha
Television (TV) news debates present expert opinions, analysis and discussions on contemporary events. These debates play a critical role in navigating public belief and understanding of socio-politically relevant topics. This encourages several agencies to analyze TV news debate content to monitor its influence. The availability of a huge (and ever increasing) amount of news debate data calls for automatic content analysis. TV news debates are generally argumentative in nature. Such arguments are often associated with the presence of shouted, overlapped, and competitive speech. In this context, the present thesis aims to detect these three speech categories in Indian TV news debates. The first contribution of this thesis is the development of an Indian Broadcast News Debate (IBND) corpus containing audio signals from 15 news debates (approximately 13 hours). A multi-level annotation procedure was followed to obtain the final annotations for the three targeted tasks of the thesis. The second contribution lies in the proposal of excitation source based Shouted Speech Detection (SSD). Both handcrafted and learned features from excitation source-based representations are explored for SSD. An autoencoder with a Bi-GRU based architecture is used as the classifier. The third aim of the thesis is to identify the overlapped speech in TV news debates. Phase-based representations of the speech signals are established as efficient features for Overlapped Speech Detection (OSD) using a CNN-LSTM based classifier.
Finally, the shouted and overlapped speech classification network embeddings and their prediction scores are used as features to identify the competitive speech. It has been shown that the detection of competitive speech can be performed efficiently using high-level information of both shouted and overlapped speech.

Operation and Control of Smart Transformer Based Meshed Hybrid Microgrid (2023) Das, Dwijasish
The excessive use of fossil fuels for power generation in previous decades has led to various environmental concerns. Moreover, such fuels are of limited availability. These factors have encouraged engineers and scientists to look for alternative renewable energy sources (RES) for power generation. Various RES like solar photo-voltaic (PV), wind, geothermal, etc., have been used for power generation and injection into the electric grid. However, such changing trends come with their own limitations. RES are generally intermittent in nature, with widely varying levels of availability throughout the day and round the year. In addition, such sources need a power electronic interface for power injection into the electric grid. These factors give rise to various challenges like voltage variations, faults, harmonics in voltages and currents, islanded operation, complexity of control, etc. Various power electronic equipment such as distribution static compensator (DSTATCOM), dynamic voltage restorer (DVR), unified power quality conditioner (UPQC), static transfer switch, static current limiter, etc., are used in the electric grid to address such challenges.

Average Modeling and DC-Link Capacitor Voltage Regulation of SRF-dq Controlled Single-Phase ANPCI for Solar and Wind Power Applications (2022) Missula, Jagath Vallabhai
A Voltage Source Inverter (VSI) converts DC voltage to AC voltage with adjustable magnitude and frequency.
VSIs have numerous industrial applications, such as uninterruptible power supplies, adjustable speed drives, High Voltage DC (HVDC) transmission, Flexible AC Transmission Systems (FACTS), renewable power generation, etc. Based on the number of output voltage levels, VSIs can be classified as two-level inverters and Multi-Level Inverters (MLIs). Due to their high voltage and large power handling capability and reduced Total Harmonic Distortion (THD) in the output voltage, MLIs are preferred to two-level inverters, mostly in medium and high-power applications. Neutral Point Clamped (NPC) MLI, Flying Capacitor (FC) MLI and cascaded H-bridge MLI are the most popular topologies among the various MLIs available in the literature.

Design and Implementation of Hardware-Efficient Architectures for FFT Algorithms (2024) Hazarika, Jinti
The Fast Fourier Transform (FFT) holds significance across diverse applications in wireless communications, audio, and signal processing. This doctoral thesis addresses the imperative need to enhance hardware efficiency while concurrently minimizing area and power consumption in FFT processors. Extensive efforts by researchers have centered on optimizing FFT algorithms and determining the requisite number of multipliers, adders, and registers, all of which intricately influence power consumption and overall area. These considerations become pivotal constraints in FFT applications, necessitating a judicious trade-off between complexity and performance.

Investigation on Multi-dynamic Radar System: A concept for Airborne Surveillance Application (2024) Qumar, Javed
During the last few decades, stealth technology has proven to be one of the most effective approaches to hiding a target from radar systems. The basic concept of low observability is mainly the reduction of Radar Cross Section (RCS) in the direction of the receiver. For detecting such targets, the concepts of bistatic and multi-static radar have therefore attracted substantial attention.
However, further challenges arise when radar platforms are mobile or airborne. Geometrical configurations are studied with different radar spacings; this spacing (the baseline distance between transmitter and receiver) is one of the bistatic radar parameters that extends detection coverage beyond that of monostatic radar. The simulation is further extended to multi-dynamic scenarios. Identifying the transmitted waveform is important so that the returned signal can be processed accordingly. Transmitter identification is simulated using an augmented BPSK/BASK-based waveform ID combined with a standard LFM waveform. An alternative approach is also simulated, in which the IFF Mode-S waveform carries the transmitter ID so that it can serve as the waveform ID of the radar.