Department of Computer Science and Engineering
Browsing Department of Computer Science and Engineering by Subject "COMPUTER SCIENCE AND ENGINEERING"
Now showing 1 - 20 of 88
Item: 4-4, 1-4: A Novel architecture for data center networks and its performance study (2016). Author: Kumar, Ashk A.R.
Advancements in Internet technologies have changed the service delivery model from in-house delivery to delivery through the cloud. This cost-effective service delivery has created greater demand for the cloud and thus for cloud infrastructure. The major cloud service providers such as Google, IBM and Microsoft use data centers as the central resource for their cloud computing. A data center is therefore defined as a central repository of computing, storage and networking for storing and processing large data and information that can be accessed globally. With data centers hosting millions of servers, they face challenges such as reliability, availability, maintainability and safety. One of the major challenges in the design of a data center is to combine a very large number of servers into a single network fabric called the data center network (DCN) and to design protocols that address the growing needs of data centers. The major advantages in the design of protocols for DCNs stem from the proprietorship of data centers: since data centers are private, their DCNs can have a more controlled structure and often do not face interoperability problems. This has led to many proposed DCN designs addressing various aspects of data centers. In our first contribution, we propose a new architecture for DCNs called 4-4, 1-4, based on an IP address hierarchy, to overcome the shortcomings of earlier location-based routing. In our second contribution, we study the performance of 4-4, 1-4 in terms of energy efficiency. In the third contribution, we propose a packet scheduler for meeting the deadlines of flows. In the fourth and last contribution, we propose a redesign of modular data center networks for efficiency.

Item: Adaptivity and Interface Design: A Human-Computer Interaction Study in E-Learning Applications (2013). Author: Deshpande, Yogesh D.
Computer-based teaching-learning, or e-learning, provides more flexible methods of interaction with learning content compared to the traditional classroom set-up. It motivates learners towards self-learning and evaluation in an open virtual environment. However, the usefulness of e-learning depends upon learner beliefs and the degree of adjustment or adaptation shown by the learner in his or her learning behavior. The learning goal and the learning interface have a decisive role in influencing learner adaptations. Various researchers have addressed issues in learner adaptations to (a) the cognitive levels of learning goals and (b) the interaction environment; however, these have been addressed separately. Also, an efficient methodology for quantifying learner adaptations and the learner's ability to familiarize with learning interfaces was lacking. Both these shortcomings are addressed in this thesis by providing a methodology for measuring adaptations. An adaptation score that quantifies adaptations and an adaptivity score that quantifies the ability to adapt are proposed. The thesis attempts to explain the combined impact of learning task complexity and user interface design on learner adaptations in beliefs, interactions and performance, which had not been done before. Quantitative data of e-learning interactions involving three basic cognitive levels of learning complexity (knowledge, comprehension and application) and two types of navigation design (hierarchical horizontal menu and non-hierarchical split menu) were analyzed.
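The abstract does not give the scoring formulas, but the idea of quantifying adaptation from interaction logs can be illustrated with a toy computation. A minimal sketch; the metric names and the relative-change formula below are hypothetical illustrations, not the thesis's actual definitions:

```python
# Hypothetical illustration only: the thesis's actual adaptation and
# adaptivity scores are not defined in this abstract. Here, "adaptation"
# between two similar tasks is measured as the mean relative change in
# interaction metrics such as completion time and navigation clicks.

def adaptation_score(task1: dict, task2: dict) -> float:
    """Mean relative improvement from task1 to task2 over shared metrics.
    Positive values suggest the learner adapted (faster, fewer actions)."""
    changes = []
    for metric in task1.keys() & task2.keys():
        before, after = task1[metric], task2[metric]
        if before > 0:
            changes.append((before - after) / before)
    return sum(changes) / len(changes) if changes else 0.0

# Two attempts at similar comprehension-level tasks on the same interface.
t1 = {"completion_time_s": 120.0, "nav_clicks": 18, "errors": 3}
t2 = {"completion_time_s": 90.0, "nav_clicks": 12, "errors": 2}
print(f"adaptation score: {adaptation_score(t1, t2):.3f}")  # ~0.306
```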
The empirical data suggest that learning task complexity (cognitive level) affects adaptations in interactions between similar tasks (task adaptation) on the same interface. Since these task adaptations did not vary across user interfaces, they were found to be task-dependent. As a result, the cognitive load of learning could be judged by the task adaptation score and utilized to adapt the pedagogic strategy or learning content. Results of our study reveal that belief in one's own e-learning skills (self-efficacy) affected adaptations in learning behavior and learning performance. On the other hand, adaptivity to the navigation design of the user interface was found to be interface-dependent and, interestingly, also influenced learning performance. The beliefs were found to mediate the adaptivity scores. Based on the results of the experiments, the thesis provides recommendations on utilizing these metrics in the personalization of e-learning on the basis of the adaptations. The study examines the phenomenon of interaction between human and computer using a multidisciplinary view of Human-Computer Interaction (HCI) combining computer science, behavioral science and education.

Item: Algorithms for some Steiner tree problems on Graphs (2020). Author: Saikia, Parikshit
In this research work we study the Steiner tree (ST) problem in the distributed setting. Given a connected undirected graph with non-negative edge weights and a subset of terminal nodes, the goal of the ST problem is to find a minimum-cost tree spanning the terminals. The first contribution is a deterministic distributed algorithm for the ST problem (the DST algorithm) in the CONGEST model which guarantees an approximation factor of 2(1 − 1/ℓ), where ℓ is the number of leaf nodes in the optimal ST. It has a round complexity of O(S + √n · log* n) and a message complexity of O(Sm + n^(3/2)) for a graph of n nodes and m edges, where S is the shortest path diameter of the graph. The DST algorithm improves the round complexity of the best distributed ST algorithm known so far, which is Õ(S + √(min{St, n})), where t is the number of terminal nodes. We modify the DST algorithm and show that a 2(1 − 1/ℓ)-approximate ST can be deterministically computed using Õ(S + √n) rounds and Õ(mS) messages in the CONGEST model. The CONGESTED CLIQUE model (CCM) is a special case of the CONGEST model in distributed computing. In this model we propose two deterministic distributed algorithms for the ST problem: STCCM-A and STCCM-B. The first one computes an ST using Õ(n^(1/3)) rounds and Õ(n^(7/3)) messages. The second one computes an ST using O(S + log log n) rounds and O(Sm + n^2) messages. Both algorithms achieve an approximation ratio of 2(1 − 1/ℓ). To the best of our knowledge, this is the first work to study the ST problem in the CCM to date. We also study a generalized version of the ST problem called the prize-collecting Steiner tree (PCST). Problems related to the PCST, such as MST, ST, and Steiner forest, have been widely studied in the distributed setting. However, the PCST has seen very little progress in the distributed setting (the only attempt seems to be a manuscript due to Rossetti, an M.Sc. thesis, University of Iceland, Reykjavik, 2015). We present two deterministic distributed algorithms for the PCST problem in the CONGEST model: D-PCST and modified D-PCST. Both algorithms are based on the primal-dual technique, preserve the dual constraints in a distributed manner, and achieve an approximation factor of (2 − 1/(n−1)).
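For context, the 2(1 − 1/ℓ) factor is the guarantee of the classical metric-closure construction for Steiner trees: compute shortest paths between terminals, take a minimum spanning tree of the resulting complete graph, and expand its edges back into paths. A minimal centralized sketch of that construction, assuming the networkx library; it illustrates the approximation bound only and is not the distributed DST algorithm itself:

```python
# Classical 2(1 - 1/l)-approximate Steiner tree via the metric closure
# (Kou-Markowsky-Berman style): shortest paths between terminals, an MST
# of the induced complete graph, then expansion back into graph paths.
# Centralized sketch for illustration; the DST algorithm achieves the
# same factor in the CONGEST model by very different (distributed) means.
import networkx as nx

def steiner_tree_2approx(G: nx.Graph, terminals: list) -> nx.Graph:
    # 1. Metric closure on the terminals: edge weight = shortest-path cost.
    closure, paths = nx.Graph(), {}
    for t in terminals:
        dist, path = nx.single_source_dijkstra(G, t, weight="weight")
        for u in terminals:
            if u != t:
                closure.add_edge(t, u, weight=dist[u])
                paths[(t, u)] = path[u]
    # 2. MST of the metric closure.
    mst = nx.minimum_spanning_tree(closure, weight="weight")
    # 3. Expand each MST edge back into a shortest path of G.
    H = nx.Graph()
    for t, u in mst.edges():
        p = paths[(t, u)]
        for a, b in zip(p, p[1:]):
            H.add_edge(a, b, weight=G[a][b]["weight"])
    # 4. MST of the expanded subgraph, pruning non-terminal leaves.
    T = nx.minimum_spanning_tree(H, weight="weight")
    while True:
        leaves = [v for v in T if T.degree(v) == 1 and v not in terminals]
        if not leaves:
            return T
        T.remove_nodes_from(leaves)

G = nx.Graph()
G.add_weighted_edges_from([("a", "b", 1), ("b", "c", 1), ("a", "c", 3),
                           ("c", "d", 2), ("b", "d", 4)])
T = steiner_tree_2approx(G, ["a", "d"])
print(sorted(T.edges(data="weight")))  # the a-b-c-d path of total weight 4
```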
The D-PCST algorithm computes a PCST using O(n^2) rounds and O(mn) messages. The modified one computes a PCST using O(Dn) rounds and O(mn) messages, where D is the unweighted diameter of the graph. Both algorithms require O(Δ log n) bits of memory in each node, where Δ is the maximum degree of a node in the graph.

Item: (An) Online Semi Automated Part of Speech Tagging Technique Applied To Assamese (2013). Author: Dutta, Pallav Kumar
Developing annotated tagged corpora for a language with limited electronic resources can be very demanding. Although Assamese is a language spoken by about 15 million people in the Indian state of Assam as a first language, the development of electronic resources for the language has been lagging behind other Indian languages. Also, there has not been much work done on POS tagging for Assamese. In order to fill this gap, we have designed a POS tagger for Assamese. Our approach is to use a combination of methods to try to get good results. Further, we amortize the manual intervention over the period of tagging rather than doing all manual work at the beginning. This allows us to start the tagging system quickly, but it also means that what we have is a semi-automatic tagger and not an automatic tagger. Our method requires native speakers' intervention only in stages other than the beginning, making the system amenable to some form of collaborative tagging with a few experts for moderation. This will enable our system to create very large tagged corpora in the language. We first create a knowledge base using a number of methods. This knowledge base is then used to automatically tag sentences in the language. This tagging uses a combination of stemming, application of a few grammatical rules, and a bigram tagger. The tagged sentences are then shown to non-expert native speakers for verification and correction. Before starting the actual tagging process, the knowledge base was tuned by examining the results on small data sets using experts instead of native speakers. The design of a user-friendly interface plays an important role in reducing the time taken by native speakers in their examination.

Item: Anomaly Detection in Oil Well Drilling Operation Using Artificial Intelligence Based Approaches (2021). Author: Tripathi, Achyut Mani
Artificial intelligence (AI) based approaches, and in particular machine learning techniques, have been extensively applied in domains such as health care, computer vision, and network security to build complex and accurate models that can produce more efficient solutions. The oil and gas sector generates massive, high-volume data during the extraction of oil and gas. Stuck pipe, borehole instability, washout, and kick are among the more recurrent problems that occur during drilling operations and cause enormous financial loss to the oil and gas industries. At present, these problems are solved by first-principle models and by appeal to highly experienced drillers who can prevent such unwanted situations. Machine learning and AI techniques have shown tremendous performance in solving various research problems that involve massive real-time data, but their capabilities have not been explored entirely in the domain of the oil industry. There is still a need for data-driven models that can address oil well drilling complications. The oil well drilling process needs a mechanical framework, also known as a rig.
The rig contains different functional units with multiple sensors that provide measurements of different hydraulic and mechanical parameters, which are helpful for monitoring the oil well drilling process. The data measured by the rig sensors are stored in a supervisory control and data acquisition (SCADA) system. The data stored in the SCADA system are multivariate time series, which can be utilized to develop various machine learning models that accurately provide ongoing insight into the oil well drilling process. These data-driven supervisory models can be used for identifying oil well drilling complications. This research work primarily aims at developing AI-based models that can be used to realize systems capable of automatically detecting anomalies during oil well drilling operations. The focus is on stuck pipe anomalies, which are recurrent during the drilling operation. The above-mentioned aim is attained through the following three contributions. The first contribution is the development of a hierarchical classifier that identifies oil well drilling activities from real-time oil well drilling data and also provides a detailed report showing the percentage of time each drilling activity is performed in one complete cycle of the oil well drilling. The second contribution describes a novel probabilistic model that combines a Dynamic Naive Bayesian Classifier and Fuzzy AdaBoost to identify the anomalies that lead to stuck pipe complications during the oil well drilling process. The last contribution explains a novel Contextual Dynamic Bayesian Network that detects contextual anomalies occurring during the oil well drilling process. All the developed models have been tested using real data from various wells located in Assam. The activity detection module has also been validated by deploying it at the well sites, and the results are satisfactory.

Item: An Architectural Framework for Seamless Hand-off between UMTS and WLAN Network (2014). Author: Barooah, Maushumi
In recent years, cellular wireless technologies like GPRS, UMTS and CDMA and Wireless Local Area Network (WLAN) technologies like IEEE 802.11 have seen a quantum leap in their growth. Cellular technologies can provide data services over a wide area, but with lower data rates. WLAN technologies offer higher data rates, but over smaller areas, popularly known as hotspots. The demand for a ubiquitous data service can be fulfilled if it is possible for the end user to seamlessly roam between these heterogeneous technologies. In this thesis, a novel architectural framework is proposed consisting of an intra-ISP switching network (ISN) which is fused between the UMTS and WLAN networks as well as data (Internet) services, for providing seamless mobility without affecting user activities. The ISN uses MPLS and Multiprotocol BGP to switch the data traffic between UMTS and IEEE 802.11 networks as per the movements of the user. The ISN is integrated with the UMTS network at the GGSN-3G and at the Access Point for the IEEE 802.11 network, respectively. The Mobile Node considered is a high-end device (e.g., a PDA or smartphone) which is equipped with two interfaces, one for UMTS and the other for WiFi, and can use both interfaces simultaneously. Simulation results show the improved performance of the ISN-based framework over existing schemes.
Most of the traffic in today's networks uses the Transmission Control Protocol (TCP) as the transport layer protocol for reliable end-to-end packet delivery. However, TCP considers packet loss to be the result of network congestion, which makes it unsuitable for mobile wireless communication, where sporadic and temporary packet losses are usually due to fading, shadowing, hand-off and other radio effects. During vertical handoff between different wireless technologies, the problem of end-to-end connection and reliability management for TCP becomes more severe. This thesis also evaluates the performance of TCP over the proposed ISN-based framework. The improved TCP scheme uses a cross-layer interaction between the network and transport layers to estimate the TCP retransmit timeout and congestion window during handover. Simulation results establish the effectiveness of the proposed scheme. Ensuring Quality of Service (QoS) for mobile users during vertical handover between IEEE 802.11 and UMTS is another key requirement for seamless mobility and transfer of existing connections from one network to another. The QoS assurance criteria for existing connections can be affected by fluctuations in data rates when a user moves from the high-speed WLAN network to the low-speed UMTS network, even in the presence of another WLAN network in its vicinity. This can happen if the alternate WLAN network is highly loaded. Therefore, handover from a high-speed network to a low-speed network should be avoided whenever possible. The final contribution of this thesis proposes a QoS-based handover procedure that prioritizes existing connections over new connections, so that rate fluctuations due to handover can be avoided if there exists another WLAN network in the range of the mobile user. Whenever the possibility of handover is detected, a pre-handover bandwidth reservation technique is used to reserve bandwidth at the alternate WLAN networks to avoid QoS degradation. The proposed scheme is implemented in the Qualnet network simulator and...

Item: Asymmetric Region Local Binary Patterns for face Image Analysis (2014). Author: Naika C. L., Shrinivasa
This thesis explores feature extraction techniques based on Local Binary Patterns (LBP) for automatic face image analysis.

Item: Automatic Language Identification in Online Multilingual Conversations (2021). Author: Sarma, Neelakshi
With the abundance of multilingual content on the Web, Automatic Language Identification (ALI) is an important pre-requisite for different Natural Language Processing applications. While ALI of well-edited text over a fairly distinct collection of languages may be regarded as a trivial problem, ALI in social media text is considered a non-trivial task due to the presence of slang words, misspellings, creative spellings, and special elements such as hashtags and user mentions. Additionally, in a multilingual environment, phenomena such as code-mixing and lexical borrowing make the problem even more challenging. Further, the use of the same script to write content in different languages, whether due to transliteration or to a script shared between languages, imposes additional challenges for language identification. Also, many existing studies in ALI are not suitable for low-resource languages for either of two reasons. First, the languages may actually lack the required resources such as dictionaries, annotated corpora, and clean monolingual corpora.
Second, the languages may have basic resources in their native scripts, but due to the use of transliterated text, the available resources are rendered useless. Considering the challenges involved, this thesis work aims to address the problem of automatic language identification of code-mixed social media text in transliterated form in a highly multilingual environment. The objective is to use minimal resources so that the proposed techniques can be easily extended to newer languages with fewer resources. Although the language identification techniques explored in this study are generic in nature and not specific to any languages, to conduct various experimental investigations this study generates three manually annotated and three automatically annotated language identification datasets. The datasets are generated by collecting code-mixed user comments from a highly multilingual social media environment. Altogether, the datasets are composed of six languages: Assamese, Bengali, Hindi, English, Karbi and Boro. Apart from dataset generation, this thesis work makes four important contributions. First, it studies the language characteristics of user conversations in a highly multilingual environment. Interesting observations with regard to language usage and the factors influencing language choices in a multilingual environment are obtained from this study. Second, a technique for sentence-level language identification is proposed that takes advantage of the social and conversational features of user conversations. The proposed technique outperforms the baseline set-ups and enhances language identification performance in a code-mixed noisy environment. Third, a word-level language identification framework is proposed that makes use of sentence-level language annotations instead of the traditionally used word-level language annotations. The proposed method focuses on learning word-level representations by exploiting sentence-level structural properties to build suitable word-level language classifiers. The proposed technique substantially reduces the manual annotation effort required while yielding encouraging performance. Fourth, a word-level language identification technique is proposed that makes use of a dynamic switching mechanism to enhance word-level language identification performance in a highly multilingual environment. The proposed switching mechanism attempts to make the correct choice between two different classification outcomes when one of the outcomes is incorrect. The proposed framework yields better performance than the constituent classifiers trained over a set of non-complementary features. The proposed set-up also outperforms the baseline set-ups using minimum annotated resources and no external resources, thus making it suitable for low-resource languages. The various automatic language identification techniques proposed in this study make use of minimal resources: information obtained from the same set of sentence-level annotated data is used to train both sentence-level as well as word-level classification models. As such, the proposed techniques are deemed suitable for automatic language identification of low-resource languages.
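As a point of reference for this kind of pipeline, here is a minimal character n-gram sentence-level language identifier. scikit-learn, the toy romanized training sentences, and the label set are illustrative assumptions, not the thesis's datasets or models, which additionally exploit conversational features and dynamic switching:

```python
# Minimal sentence-level language identification with character n-grams.
# Illustrative only: the tiny romanized training set below is made up, and
# the thesis's classifiers additionally exploit social and conversational
# features plus a dynamic switching mechanism that is not shown here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy transliterated (sentence, language) pairs.
train = [
    ("ami bhat khai", "Assamese"),
    ("tumi kemon acho", "Bengali"),
    ("main ghar ja raha hoon", "Hindi"),
    ("i am going home now", "English"),
]
texts, labels = zip(*train)

# Character 1-3 grams are robust to the creative spellings and
# misspellings that are common in social media text.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
    MultinomialNB(),
)
clf.fit(texts, labels)
print(clf.predict(["tumi kothay jabe"]))  # expected to lean towards Bengali
```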
The proposed techniques are also able to enhance language identification performance in a code-mixed noisy environment.

Item: Automatic speaker recognition using low resources: Experiments in feature reduction and learning (2018). Author: Kumar, Mohit
The main objective of this thesis is to explore reduction of the computations involved in the Automatic Speaker Recognition (ASR) task and automatic generation of representations for speaker-related information from speech data. ASR systems heavily depend on the features used for representing speech information. Over the years, there has been a continuous effort to generate features that represent speech as well as possible. This has led to the use of larger feature sets in speech and speaker recognition systems. However, with the increasing size of the feature set, it may not necessarily be true that all features are equally important for speech representation. We investigate the relevance of individual features in one of the popular feature sets, MFCCs.

Item: Capacity Enhancement, QoS and Rate Adaptation in IEEE 802.11s: A Performance Improvement Perspective (2014). Author: Chakraborty, Sandip
Current deployments of wireless community and municipal area networks provide ubiquitous connectivity to end users through a wireless mesh backbone that aims at replacing wired infrastructure with wireless multi-hop connectivity. The IEEE 802.11s standard was published recently to support mesh connectivity over the well-deployed IEEE 802.11 architecture based on Wireless Fidelity (WiFi) access networks. This thesis explores a number of research directions to optimize the mesh peering, channel access, scheduling and mesh path selection protocols for IEEE 802.11s mesh networks. The standard provides three major protocols to support mesh functionality: the Mesh Peer Management protocol (MPM) to establish mesh connectivity and for topology management, Mesh Coordinated Channel Access (MCCA) for channel access and scheduling, and the Hybrid Wireless Mesh Protocol (HWMP) to support mesh path establishment based on link-layer characteristics. The objective of this thesis is to augment the existing protocols for better connectivity and efficient usage of resources. In a mesh network, the efficiency of the backbone network can be improved through directional communication by exploiting spatial reuse capability. However, the use of directional antennas imposes several new research challenges that are explored in this thesis. The first contribution of this thesis enhances the functionality of the mesh channel access and path selection protocols to support directional communication over an IEEE 802.11s mesh backbone. Though MCCA provides reservation-based channel access, the standard does not implement any specific mechanism for multi-class traffic services to improve the Quality of Service (QoS) for end users. The next contribution in this direction provides QoS support and service differentiation for the MCCA-based channel access mechanism over the multi-interface communication paradigm. Modern wireless hardware is capable of supporting multiple data rates depending on wireless channel dynamics.
As a consequence, the MPM protocol has been augmented to support multi-rate adaptation over IEEE 802.11s protocol elements.

Item: Computational Modeling of Free-viewing Attention on Multimodal Webpages - A Machine Learning Approach (2020). Author: Sandeep, Vidyapu
With the progressive expansion of competitive e-commerce and Web resources, attention modeling is essential for Web authors, information creators, advertisers, and Web designers to understand and predict user attention on webpages. State-of-the-art models often overlook the design-oriented visual features of constituent web elements, including text and images. The bottleneck has been to incorporate the elements' heterogeneous features into the model, as texts are represented using features such as 'text-size' and 'text-color' whereas images are represented using 'brightness', 'intensity' and 'color histograms'. This thesis work is predominantly centered around overcoming this heterogeneity bottleneck to predict the user's free-viewing attention on multimodal webpages, specifically those consisting of text and image modalities. Owing to the prominence of position, position-based free-viewing attention allocation is first investigated and computationally modeled, separately for text and image elements. The analyses revealed: (i) elements positioned in the Right and Bottom regions of a webpage are not always ignored; (ii) space-related (column-gap, line-height, padding) and font-size-related (font-size, font-weight) intrinsic text features, and mid-level color histogram intrinsic image features, are informative, while position and size are informative for both types; (iii) the informative visual features predict the ordinal visual attention on an element with 90% average accuracy and a 70% micro-F1 score; (iv) for the prominent images, the visual features also help in predicting weighted-voting-based, kernel-based, and multiple levels of user attention. Leveraging the prominence of web elements' visual features, a Canonical Correlation Analysis (CCA) based computational approach is proposed to unify both modalities and to predict user attention at the granularity of web elements as well as webpages. The results reveal: (i) text and images are unifiable if the interface idiosyncrasies alone, or along with user idiosyncrasies, are constrained; (ii) the font-families of text are as influential as, and comparable to, image color-histogram visual features in achieving the unification. The achieved unification also outperforms the random baseline in predicting user attention on individual web elements as well as overall webpages. This thesis work finds applications in user attention prediction, web designing, and user-oriented webpage rendering.

Item: Consistent Online Backup in Transactional File Systems (2012). Author: Deka, Lipika
A consistent backup, preserving data integrity across files in a file system, is of utmost importance for the purpose of correctness and minimizing system downtime during the process of data recovery. With the present-day demand for continuous access to data, backup has to be taken of an active file system, putting the consistency of the backup copy at risk. We propose a scheme referred to as mutual serializability to take a consistent backup of an active file system, assuming that the file system supports transactions.
The scheme extends the set of conflicting operations to include read-read conflicts, and it is shown that if the backup transaction is mutually serializable with every other transaction individually, a consistent backup copy is obtained. The user transactions continue to serialize among themselves using some standard concurrency control protocol such as Strict 2PL. Starting by considering only reads and writes, we extend the scheme to include file operations such as directory operations, file descriptor operations and operations such as append, truncate and rename, as well as operations that insert and delete files. We put our scheme into a formal framework to prove its correctness, and the formalization as well as the correctness proof is independent of the concurrency control protocol used to serialize the user transactions. The formally proven results are then realized by a practical implementation and evaluation of the proposed scheme. In the practical implementation, applications run as a sequence of transactions and, under normal circumstances when the backup program is not active, they simply use any standard concurrency control technique such as locking or timestamp-based protocols (Strict 2PL in the current implementation) to ensure consistent operations. Once the backup program is activated, all other transactions are made aware of it by some triggering mechanism, and they then need to serialize themselves with respect to the backup transaction as well. If at any moment a conflict arises while establishing the pairwise mutually serializable relationship, the conflicting user transaction is either aborted or paused to resolve the conflict. We ensure that the backup transaction completes without ever having to roll back by ensuring that it reads only from committed transactions and by never choosing it as the victim for resolving a conflict. To be able to simulate the proposed technique, we designed and implemented a user-space transactional file system prototype that exposes ACID semantics to all applications. We simulated the algorithms devised to realize the proposed technique and ran experiments to help tune the algorithms. The system was simulated through workloads exhibiting a wide range of access patterns, and experiments were conducted on each workload in two scenarios, one with the mutual serializability protocol enabled (thus capturing a consistent online backup) and one without (thus capturing an inconsistent online backup), comparing the results from the two scenarios to calculate the overhead incurred while capturing a consistent backup. The performance evaluation shows that workloads resembling most present-day real workloads, exhibiting low inter-transactional sharing and actively accessing only a small percentage of the entire file system space, incur very little overhead (2.5% in terms of transactions conflicting wit...

Item: Context Aware Handover for WiFi and Its Extension to WiMAX (2014). Author: Sarma, Abhijit
IEEE 802.11, or Wireless Fidelity (WiFi), has become a popular wireless technology to offer high-speed Internet access at public places called hotspots, as well as to support ubiquitous Internet connectivity through institute-wide wireless local area networks (WLANs). However, existing research has shown that due to the widespread deployment of WiFi-based network connectivity zones, more wireless access points (APs) are deployed than required, yet users tend to concentrate in a few areas, creating traffic load imbalance across the network.
The design philosophy of IEEE 802.11 connection establishment and handover from one AP to another is based on signal strength, which is biased towards the distance between the AP and the client nodes. Severe performance and quality of service (QoS) degradation and capacity underutilization are observed due to this imbalanced traffic distribution, which is the main concern of the research in this thesis. The first contribution of the thesis explores the inherent problems of IEEE 802.11 handover management policies and proposes a context-aware handover mechanism to balance traffic load across the network. The proposed mechanism works through coordinated information exchange between the AP and the wireless client that experiences performance degradation due to traffic overload at its present point of attachment. This coordination helps the wireless client perform a horizontal handover to another AP in the vicinity, which significantly improves the network capacity. The performance of the proposed context-aware handover mechanism is analyzed theoretically as well as through practical testbed results. The second contribution of the thesis extends the context-aware handover to incorporate multiple traffic classes, where different traffic classes require different amounts of bandwidth to sustain acceptable quality of experience (QoE) for end users. Consequently, a class-aware load balancing scheme is designed to reserve traffic resources a priori when an impending handover is observed.

Item: Data Pruning Based Outlier Detection (2015). Author: Pamula, Rajendra
Due to the advancement of the data storage and processing capabilities of computers, most real-life applications have shifted to digital domains, and many of them are data intensive. In general, most applications deal with similar types of data items, but for a variety of reasons some data points in a data set deviate from the normal behavior of common data points. Such data points are referred to as outliers, and in general the number of outliers in a data set is small. Identifying the outliers in a reasonably big data set is a challenging task. Several methods have been proposed in the literature to identify outliers, but most of the methods are computation intensive. Due to the diverse nature of data sets, a particular outlier detection method may not be effective for all types of data sets. The main focus of this work is to develop algorithms for outlier detection with an emphasis on reducing the number of computations. The number of computations can be reduced if the data set is reduced by removing data points which are obviously not outliers. The number of computations also depends on the number of attributes of the data points; while detecting outliers it may be possible to work with fewer attributes by considering only one attribute from a set of similar or correlated attributes. The objective of this work is to reduce the number of computations while detecting outliers and to study the suitability of each method for a particular class of data set. Our methods are based on clustering techniques and divide the whole data set into several clusters at the beginning. Depending on the nature of the clusters, we propose methods to reduce the size of the data set and then apply outlier detection methods to find the outliers.
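A minimal sketch of this prune-then-detect idea, assuming scikit-learn and NumPy; the pruning heuristic and the thresholds below are illustrative stand-ins for the thesis's cluster- and point-pruning criteria:

```python
# Prune-then-detect sketch: cluster the data, prune points that are almost
# certainly inliers (compact cluster cores), then run a kNN-distance
# outlier detector on the surviving candidates only. The pruning rule and
# thresholds are illustrative stand-ins for the thesis's criteria.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors

def prune_then_detect(X, n_clusters=5, keep_frac=0.3, n_outliers=10, k=5):
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
    dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
    # Point pruning: keep only the keep_frac of points farthest from their
    # own centroid; everything else is treated as an obvious inlier.
    candidates = np.where(dist >= np.quantile(dist, 1.0 - keep_frac))[0]
    # kNN-distance outlier score, computed against the full data set so
    # that pruning does not distort the scores.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dists, _ = nn.kneighbors(X[candidates])
    scores = dists[:, 1:].mean(axis=1)        # drop the self-distance column
    return candidates[np.argsort(scores)[::-1][:n_outliers]]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)),    # dense cluster 1
               rng.normal(8, 1, (200, 2)),    # dense cluster 2
               [[20.0, 20.0], [-15.0, 5.0]]]) # two planted outliers
print(prune_then_detect(X, n_clusters=2, n_outliers=2))  # likely [400 401]
```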
We propose three methods, based on the characteristics of the clusters, to identify clusters that may not contain outliers; such clusters are pruned from the data set. We also propose a method to identify the inlier points in each cluster and prune those points from the clusters. We use the principle of data summarization and propose a method that involves both cluster pruning and point pruning. For high-dimensional data sets, we propose a method that involves attribute pruning to reduce the number of computations while detecting outliers. Once the pruning step is performed, a reduced data set results, and outlier detection techniques are then applied to detect the outliers. For each method, we demonstrate its effectiveness by performing experiments.

Item: Decision Diagrams Based On-line Testing of Digital VLSI Circuits (2017). Author: Biswal, Pradeep Kumar
The rapid increase in the complexity of VLSI circuits with the advent of Deep Sub-Micron (DSM) technology causes faults to develop during their normal operation. Such faults cannot be detected by off-line test or Built-In Self-Test (BIST) techniques; thus, On-line Testing (OLT) is becoming an essential part of Design for Testability (DFT). Most of the existing works in the literature on OLT of digital circuits have emphasized the following: non-intrusiveness, totally self-checking behavior, low area overhead, high fault coverage, low detection latency, etc. However, in the DSM era, several other factors need to be considered, namely flexibility, coverage of advanced fault models, scalability, handling of asynchronous circuits, etc. Considering all these facts, the main objective of this thesis is to design and develop efficient OLT schemes for detection of faults on-the-fly in digital VLSI circuits. In the first contribution of the thesis, we propose an Ordered Binary Decision Diagram (OBDD) based OLT scheme for digital circuits that considers the "number of tap points" as a new design parameter to provide flexibility from the OLT perspective. Experimentally, it is seen that minimization of tap points (i.e., measurement limitation) has minimal impact on fault coverage and detection latency but reduces the area overhead of the on-line tester significantly. In the second contribution of the thesis, we propose an OBDD-based OLT scheme for both feedback and non-feedback bridging faults. Experimentally, we have seen that consideration of feedback bridging faults along with non-feedback ones improves fault coverage with a marginal increase in area overhead compared to schemes involving only non-feedback faults. In the third contribution of the thesis, we propose a High Level Decision Diagram (HLDD) based OLT scheme at the Register Transfer Level (RTL) model of circuits in order to improve scalability.

Item: Deep Learning-based Techniques for Image and Video Restoration (2022). Author: Sharma, Prasen Kumar
The efficiency of several real-time vision tasks severely degrades when presented with noisy or corrupt images or videos taken in adverse rainy or hazy weather conditions. Therefore, it is of utmost importance to propose robust and effective methods that remove the noise and restore the visual quality of degraded images and videos. Recently, efforts have been afoot towards data-driven approaches due to their improved performance over prior-based schemes.
With this motivation, this thesis presents efficient data-driven methods for the following low-level vision tasks: (a) single image de-raining, (b) single image de-hazing, and (c) video de-raining. The four significant contributions of this dissertation follow. In the first contributory chapter, a deep learning-based scheme is proposed for the task of single image de-raining. The designed methodology exploits the spatial-domain aspects of the rain streaks due to their pseudo-periodic nature. In the second contributory chapter, transform-domain characteristics of the rain streaks in the image are exploited for de-noising. Unlike rain streaks, the haze in an image varies exponentially with the depth of the pixels. Hence, in the third contributory chapter, a scale-space invariant CNN is presented for the task of single image de-hazing. In the final contributory chapter, the task of video de-raining is addressed. Unlike image de-raining, video de-raining has the additional complexity of retaining temporal smoothness in the de-rained videos. Existing approaches tend to separate the spatial and temporal enhancement modules; in this work, however, a unified deep CNN is presented that simultaneously optimizes the spatial and temporal characteristics of the de-rained videos. Finally, the thesis concludes by summarizing the significant contributions and proposing some relevant future research directions.

Item: Delaunay Triangulation based Spanners for MANET (2009). Author: Satyanarayana, D.
Many position-based routing protocols use the Unit Disk Graph (UDG) as the underlying network topology for routing. Due to the large number of edges in the UDG, these protocols suffer from channel contention overhead, frequent packet collisions, and heavy resource consumption. To overcome these problems, many researchers have proposed local topology control algorithms that retain only a linear number of links in the underlying network graph based on geometric neighborhoods. These graphs are called geometric spanners. In this thesis, we study these spanners under various network requirements such as a small number of transmissions, frequent node failures, mobility, and fault tolerance. Geometric spanners like the planarized local Delaunay triangulation (PLDel), relative neighborhood graph (RNG), and Gabriel graph (GG), which are based on neighborhood properties, contain shorter edges. Because of these shorter edges, the number of transmissions between source and destination increases, which in turn increases end-to-end packet delay and jitter. We present three constraint-based planar geometric graphs, called the constrained local Delaunay triangulation (CDT), constrained relative neighborhood graph (CRNG), and constrained Gabriel graph (CGG), which reduce the number of hops by introducing longer constraint edges. In ad hoc networks, nodes can go down for various reasons, such as insufficient battery power, environmental effects like volcanic eruptions, cyclones, and floods, and accidents like landslides and debris. Moreover, to conserve energy, nodes can switch off their transmitters or go into sleep mode. There will be heavy packet loss if such nodes lie on any routing path. Similarly, a new node can join the network or an existing node can wake up from sleep mode.
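For reference, the Gabriel graph named above has a simple construction rule: an edge (u, v) is kept if and only if no third node lies inside the circle whose diameter is uv. A brute-force sketch of that textbook definition (a centralized check for clarity, not the thesis's localized or dynamic algorithms):

```python
# Gabriel graph sketch: keep edge (u, v) iff no other node w lies strictly
# inside the circle with diameter uv; equivalently, for every other node w,
# |uw|^2 + |wv|^2 >= |uv|^2. Brute-force O(n^3) centralized check for
# clarity; localized variants only test nodes within radio range.
from itertools import combinations

def gabriel_graph(points):
    def d2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    n = len(points)
    edges = []
    for u, v in combinations(range(n), 2):
        if all(d2(points[u], points[w]) + d2(points[w], points[v])
               >= d2(points[u], points[v])
               for w in range(n) if w not in (u, v)):
            edges.append((u, v))
    return edges

pts = [(0, 0), (4, 0), (2, 1), (5, 5)]
# Edge (0, 1) is dropped because node 2 falls inside its diameter circle.
print(gabriel_graph(pts))  # [(0, 2), (1, 2), (1, 3), (2, 3)]
```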
We have proposed three dynamic spanners, called the dynamic local Delaunay triangulation (DLDel), dynamic relative neighborhood graph (DRNG), and dynamic Gabriel graph (DGG), which change the network topology dynamically to preserve the spanner properties and reduce heavy packet loss. Various resource limitations and environmental constraints cause frequent link and node failures in ad hoc networks, which make the network unreliable. For example, edge disconnections occur due to buildings, walls, mountains, and other obstacles between the wireless nodes. Similarly, node failures occur due to exhausted battery power, accidents, landslides, debris, volcanic eruptions, and cyclones. The network topology should therefore be fault tolerant to take care of these failures. In this thesis, we propose algorithms for fault-tolerant versions of PLDel, RNG and GG, called the fault-tolerant local Delaunay triangulation (FTLDel), fault-tolerant relative neighborhood graph (FTRNG), and fault-tolerant Gabriel graph (FTGG), respectively, by choosing the most stable nodes. The existing spanners assume that the nodes in the network are static. Frequent topology changes due to node mobility disturb various geometric properties of the spanner such as neighborhood relations, spanning ratio, and planarity. Moreover, some of the edges may become invalid links and may lead to a disconnected network. In this thesis, we propose algorithms for the mobile local Delaunay triangulation (MLDel), mobile relative neighborhood graph (MRNG), and mobile Gabriel graph (MGG) to maintain their counterpart spanners PLDel, RNG, and GG, respectively, under mobility. These proposed spanners are simulated...

Item: Density-Based Mining Algorithms for Dynamic Data: An Incremental Approach (2021). Author: Bhattacharjee, Panthadeep
Typically, an algorithm designed for carrying out data mining tasks is fed a static set of inputs. This class of algorithms remains prone to certain disadvantages in scenarios where the input data and extracted results change temporally. The prominent bottlenecks include redundant computation and high response time, along with increased consumption of available resources. Given the importance of handling dynamic data in a real-time environment (e.g., traffic monitoring, medical research, recommendation systems), this thesis focuses on developing incremental mining algorithms, particularly in the field of density-based clustering and outlier detection. Density-based algorithms display robustness in extracting clusters of varying granularity and in filtering outliers from variable-density subspaces. In this thesis, we propose incremental extensions to two density-based clustering algorithms: MBSCAN (Mass-based Clustering of Spatial Data with Application of Noise) and SNN-DBSCAN (Shared Nearest Neighbor Density Based Clustering of Large Spatial Data with Application of Noise). For outlier detection, an incremental density-based approach is proposed for the K-Nearest Neighbor Outlier Detection algorithm, known as KNNOD. The incremental extensions to MBSCAN and KNNOD are approximate in nature, facilitating single-point insertions, whereas for SNN-DBSCAN we propose exact incremental solutions facilitating both addition and deletion of data in batch mode. Our first contribution, known as the iMass (Incremental Mass-Based Clustering) algorithm, offers an approximate incremental solution to the static MBSCAN algorithm.
The goal of this work is to identify the expensive building blocks of MBSCAN and reconstruct them incrementally after every new insertion. Observations over six real-world and two synthetic datasets showed that the proposed iMass algorithm outperformed the naive MBSCAN method by achieving a maximum efficiency gain of up to an order of 2.28 (~191 times). A mean clustering accuracy of around 60.375% was observed after the final insertion for three unlabeled datasets. Cluster quality evaluation using Normalized Mutual Information (NMI), the Rand index (RI) and F1-score for five class-labeled datasets showed similar or improved results for iMass as compared to MBSCAN. The efforts laid out in our first contribution therefore motivated us to expand our research towards exact incremental solutions. The second contribution, in the form of our proposed clustering algorithm BISDBadd (Batch Incremental Shared Nearest Neighbor Density Based Clustering Algorithm for addition), provides an exact incremental solution to the naive SNN-DBSCAN algorithm while adding points in batch mode. BISDBadd comprises two sub-variant algorithms, Batch-Inc1 and Batch-Inc2, and is comparatively the most efficient, targeting all the components of SNN-DBSCAN incrementally, unlike its sub-variant methods. BISDBadd achieved a maximum efficiency gain of up to an order of 3 (~1000 times) over five (three real-world and two synthetic) datasets. The clusters obtained were identical to those of the SNN-DBSCAN algorithm. Complementing the addition of data, the third contribution proposes the algorithm BISDBdel (Batch Incremental Shared Nearest Neighbor Density Based Clustering Algorithm for deletion), thereby providing an exact incremental solution to SNN-DBSCAN while deleting points in batch mode. Similar to BISDBadd, BISDBdel comprises two sub-variant algorithms, Batch-Dec1 and Batch-Dec2, and is comparatively the most efficient, targeting all the components of SNN-DBSCAN incrementally when points are deleted from the dataset, unlike its sub-variant methods. Compared with SNN-DBSCAN, the maximum efficiency gain achieved by BISDBdel reached up to an order of 4 (~10000 times) over five (three real-world and two synthetic) datasets, with the set of clusters obtained identical to those of the SNN-DBSCAN algorithm. Moving on from the clustering paradigm, our fourth and final contribution focuses on the dynamic extraction of at most the top-N global outliers under single-point insertions. Our proposed approximate incremental algorithm KAGO (Adaptive Grid Based Outlier Detection Approach using Kernel Density Estimate (KDE)) uses a Gaussian kernel in a grid-partitioned space to determine the local density of a point. The local density obtained through KDE is used to filter the local outliers, which are then integrated to extract at most the top-N global outliers. The KAGO algorithm outperformed KNNOD by achieving a maximum efficiency gain of up to an order of 3.91 (~8304 times) over two intrusion detection datasets and a bidding dataset for market advertisement related to a search engine. Outlier evaluation on these datasets using RI and F1-score showed a mean accuracy improvement of around 3.3% for KAGO.
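A minimal sketch of the kernel-density step this last contribution builds on: scoring each point's local density with a Gaussian kernel over a small neighborhood of grid cells and flagging low-density points. The cell size, bandwidth, and threshold below are illustrative assumptions, not KAGO's actual parameters:

```python
# Grid-based Gaussian KDE sketch: a point's local density is its mean
# Gaussian kernel weight to points in the surrounding 3x3 block of grid
# cells; low-density points are flagged as local outlier candidates.
# Cell size, bandwidth h, and threshold are arbitrary illustrative values.
import math
from collections import defaultdict

def grid_kde_outliers(points, cell=2.0, h=1.0, threshold=0.05):
    grid = defaultdict(list)                    # hash points into square cells
    for p in points:
        grid[(int(p[0] // cell), int(p[1] // cell))].append(p)

    def density(p):
        cx, cy = int(p[0] // cell), int(p[1] // cell)
        total, count = 0.0, 0
        for dx in (-1, 0, 1):                   # 3x3 neighborhood of cells
            for dy in (-1, 0, 1):
                for q in grid[(cx + dx, cy + dy)]:
                    if q is p:
                        continue                # exclude the point itself
                    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                    total += math.exp(-d2 / (2 * h * h))
                    count += 1
        return total / count if count else 0.0  # mean kernel weight

    return [p for p in points if density(p) < threshold]

pts = [(x * 0.5, y * 0.5) for x in range(10) for y in range(10)]
pts.append((30.0, 30.0))                        # one isolated point
print(grid_kde_outliers(pts))  # [(30.0, 30.0)]
```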
The thesis thus strives towards developing approximate and exact incremental algorithms in the field of density-based clustering and outlier detection, thereby facilitating real-time data analysis.

Item: Design and Development of Intrusion Detection System: A Discrete Event System Approach (2014). Author: Barbhuiya, Ferdous Ahmed
With the rapid increase of security threats on the Internet, the Intrusion Detection System (IDS), hardware or software that monitors network or host activities for malicious behavior, is an indispensable component of network security. Of the two prevalent IDS design techniques, signature-based IDSs can detect only known attacks, while anomaly-based systems can detect both known and unknown attacks but generate a large number of false alarms. There are classes of attacks, such as ARP-based attacks, ICMP-based attacks, and TCP low-rate DoS attacks, which escape detection by both signature and anomaly IDSs. This thesis proposes a Discrete Event System (DES) based approach to designing IDSs for attacks across different network layers. DES models are designed for the system under normal and failure conditions, where attacks are mapped to failures. A state estimator called a diagnoser is designed, which observes sequences of events generated by the system to decide whether the states through which the system traverses correspond to the normal or the faulty DES model; the diagnoser acts as the IDS engine. For detecting ARP-based attacks, an active probing mechanism based on ARP requests and responses is used: an active DES framework is adopted to model ARP-based attacks using a controllable event (the ARP probe) which creates a difference in the sequence of events between normal and attack conditions. Next, to handle network uncertainties due to the presence of congestion when detecting ICMP-based attacks, the I-diagnosis framework of DES has been adopted, where diagnosis is tested only in those sequences of states where a fault is followed by an indicator event. Redundant states of the I-diagnosis diagnoser are removed, and a reduced detector is also proposed to improve complexity. Further, in the Induced Low-Rate TCP DoS attack, the attack and genuine sequences of states differ with some probability, so to detect this attack a stochastic DES framework has been adopted in which the attack case can be identified with some probability. Lastly, considering the migration from IPv4 to IPv6 addressing in the Internet, a detection mechanism for NDP-based attacks in IPv6 networks is proposed. To tackle the challenge of errors in manually building complex DES models for NDP-related attacks in IPv6, an LTL-based DES framework is adopted. All proposed detection mechanisms are implemented in a testbed, and the results show the effectiveness of the systems in terms of accuracy and detection rate.

Item: Design and Implementation of a File System and a Distributed KV store on Non Volatile Memory (2024). Author: Kalita, Chandan
Non-volatile memory (NVRAM) is becoming available. With the availability of hybrid DRAM and NVRAM memory on the memory bus of CPUs, a number of experimental file systems on NVRAM have been designed and implemented. In this thesis we present the design and implementation of a file system on NVRAM called DurableFS, which provides atomicity and durability of file operations to applications. It provides ACID properties to transactions involving multiple files. Due to the byte-level random accessibility of memory, it is possible to provide these guarantees without much overhead.
We use standard techniques like copy-on-write for data and a redo log for metadata changes to build an efficient file system which provides durability and atomicity guarantees to transactions. Benchmarks on the implementation show only a 7% degradation in performance due to providing these guarantees.
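To illustrate the redo-log technique mentioned above: metadata updates are first appended to a log, a commit record makes the transaction durable, and only then are the updates applied in place, so recovery replays committed transactions and ignores incomplete ones. A toy sketch; the record format and the file-backed log are assumptions for illustration, since DurableFS itself targets byte-addressable NVRAM:

```python
# Toy redo log for metadata updates: append log records, make them
# durable, write a commit record, and only then apply updates in place.
# Recovery replays committed transactions in log order and ignores
# incomplete ones. File-backed simplification: DurableFS itself logs to
# byte-addressable NVRAM with cache-line flushes rather than fsync.
import json
import os

LOG = "redo.log"

def log_append(entry: dict) -> None:
    with open(LOG, "a") as f:
        f.write(json.dumps(entry) + "\n")
        f.flush()
        os.fsync(f.fileno())        # durability point

def commit_tx(txid: int, updates: dict, metadata: dict) -> None:
    for key, value in updates.items():
        log_append({"tx": txid, "op": "set", "key": key, "value": value})
    log_append({"tx": txid, "op": "commit"})  # transaction is now durable
    metadata.update(updates)        # in-place apply happens only after commit

def recover() -> dict:
    """Rebuild the metadata store from the log after a crash."""
    metadata = {}
    if not os.path.exists(LOG):
        return metadata
    with open(LOG) as f:
        records = [json.loads(line) for line in f]
    committed = {r["tx"] for r in records if r["op"] == "commit"}
    for r in records:               # replay committed transactions in order
        if r["op"] == "set" and r["tx"] in committed:
            metadata[r["key"]] = r["value"]
    return metadata

meta = {}
commit_tx(1, {"/a/size": 4096, "/a/mtime": 1700000000}, meta)
print(recover())  # {'/a/size': 4096, '/a/mtime': 1700000000}
```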