Logic Programming Tools for Dynamic Content Generation and Internet Data Mining

Access: Use of this item is restricted to the UNT Community
The phenomenal growth of Information Technology requires us to elicit, store and maintain huge volumes of data. Analyzing this data for various purposes is becoming increasingly important. Data mining consists of applying data analysis and discovery algorithms that, under acceptable computational efficiency limitations, produce a particular enumeration of patterns over the data. We present two techniques based on using Logic Programming tools for data mining. Data mining analyzes data by extracting patterns which describe its structure and discovers correlations in the form of rules. We distinguish analysis methods as visual and non-visual and present one application of each. We explain that our focus on the field of Logic Programming makes some of the very complex tasks related to Web-based data mining and dynamic content generation simple and easy to implement in a uniform framework.
Date: December 2000
Creator: Gupta, Anima
System: The UNT Digital Library

Memory Management and Garbage Collection Algorithms for Java-Based Prolog

Access: Use of this item is restricted to the UNT Community
Implementing a Prolog Runtime System in a language like Java, which provides its own automatic memory management and safety features such as built-in index checking and array initialization, requires a consistent approach to memory management based on a simple ultimate goal: minimizing total memory management time and extra space involved. The total memory management time for Jinni is made up of garbage collection time both for Java and Jinni itself. Extra space is usually requested at Jinni's garbage collection. This goal motivates us to find a simple and practical garbage collection algorithm and implementation for our Prolog engine. In this thesis we survey various algorithms already proposed and offer our own contribution to the study of garbage collection by improvements and optimizations for some classic algorithms. We implemented these algorithms based on the dynamic array algorithm for an all-dynamic Prolog engine (JINNI 2000). The comparisons of our implementations versus the originally proposed algorithm allow us to draw informative conclusions on their theoretical complexity model and their empirical effectiveness.
Date: August 2001
Creator: Zhou, Qinan
System: The UNT Digital Library

Peptide-based hidden Markov model for peptide fingerprint mapping.

Access: Use of this item is restricted to the UNT Community
Peptide mass fingerprinting (PMF) was the first automated method for protein identification in proteomics, and it remains in common usage today because of its simplicity and the low equipment costs for generating fingerprints. However, one of the problems with PMF is its limited specificity and sensitivity in protein identification. Here I present a method that shows potential to significantly enhance the accuracy of peptide mass fingerprinting, using a machine learning approach based on a hidden Markov model (HMM). This method is applied to improve differentiation of real protein matches from those that occur by chance. The system was trained using 300 examples of combined real and false-positive protein identification results, and 10-fold cross-validation was applied to assess model discrimination. The model can achieve 93% accuracy in distinguishing correct and real protein identification results from false-positive matches. The receiver operating characteristic (ROC) curve area for the best model was 0.833.
Date: December 2004
Creator: Yang, Dongmei
System: The UNT Digital Library
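
Note on the evaluation protocol named in the abstract above: the following is a minimal sketch of how 10-fold cross-validation partitions a data set, not the thesis author's code. The function name `k_fold_indices` is hypothetical, the sample count of 300 is taken from the abstract, and the sketch assumes the examples have already been shuffled.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Assumes the samples were shuffled beforehand; folds are contiguous blocks.
    """
    indices = list(range(n_samples))
    # Spread any remainder over the first few folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 300 examples (as in the abstract) split into 10 folds of 30 test items each.
folds = list(k_fold_indices(300, k=10))
```

Each example appears in exactly one test fold, so every example is scored once by a model that never saw it during training.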

Voting Operating System (VOS)

Access: Use of this item is restricted to the UNT Community
The electronic voting machine (EVM) plays a very important role in a country where government officials are elected into office. Throughout the world, a specific operating system that tends to the specific requirement of the EVM does not exist. Existing EVM technology depends upon the various operating systems currently available, thus ignoring the basic needs of the system. There is a compromise over the basic requirements in order to develop the systems on the basis of an already available operating system, thus leaving a lot of scope for error. It is necessary to know the specific details of the particular device for which the operating system is being developed. In this document, I evaluate existing EVMs and identify flaws and shortcomings. I propose a solution for a new operating system that meets the specific requirements of the EVM, calling it Voting Operating System (VOS, pronounced 'voice'). The identification technique can be simplified by using the fingerprint technology that determines the identity of a person based on two fingerprints. I also discuss the various parts of the operating system that have to be implemented that can tend to all the basic requirements of an EVM, including implementation of the memory manager, …
Date: December 2004
Creator: Venkatadusumelli, Kiran
System: The UNT Digital Library

Using Reinforcement Learning in Partial Order Plan Space

Access: Use of this item is restricted to the UNT Community
Partial order planning is an important approach that solves planning problems without completely specifying the orderings between the actions in the plan. This property provides greater flexibility in executing plans, making partial order planners a preferred choice over other planning methodologies. However, in order to find partially ordered plans, partial order planners perform a search in plan space rather than in the space of world states, and an uninformed search in plan space leads to poor efficiency. In this thesis, I discuss applying a reinforcement learning method, called the First-visit Monte Carlo method, to partial order planning in order to design agents which do not need any training data or heuristics but are still able to make informed decisions in plan space based on experience. Communicating effectively with the agent is crucial in reinforcement learning. I address how this task was accomplished in plan space and the results from an evaluation of a blocks world test bed.
Date: May 2006
Creator: Ceylan, Hakan
System: The UNT Digital Library
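
Note on the method named in the abstract above: the following is a generic sketch of first-visit Monte Carlo value estimation, not the plan-space formulation used in the thesis. The function name `first_visit_mc`, the episode format (lists of `(state, reward)` pairs) and the toy states are assumptions made for illustration.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma=1.0):
    """Estimate state values by averaging first-visit returns.

    Each episode is a list of (state, reward) pairs, where reward is the
    reward received on leaving that state.
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        # Record the first time step at which each state occurs.
        first_visit = {}
        for t, (state, _) in enumerate(episode):
            first_visit.setdefault(state, t)
        # Walk backwards, accumulating the discounted return from step t.
        g = 0.0
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            g = gamma * g + reward
            # Only the return following the first visit counts.
            if first_visit[state] == t:
                returns_sum[state] += g
                returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


# Two toy episodes; state "a" is revisited in the second one.
episodes = [
    [("a", 0.0), ("b", 1.0)],
    [("a", 0.0), ("a", 0.0), ("b", 3.0)],
]
values = first_visit_mc(episodes, gamma=1.0)
```

Because the estimates are averages over sampled episodes, no model of the environment and no labeled training data are required, which matches the motivation given in the abstract.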

Modeling and reduction of gate leakage during behavioral synthesis of nanoscale CMOS circuits.

Access: Use of this item is restricted to the UNT Community
The major sources of power dissipation in a nanometer CMOS circuit are capacitive switching, short-circuit current, static leakage and gate oxide tunneling. However, with the aggressive scaling of technology the gate oxide direct tunneling current (gate leakage) is emerging as a prominent component of power dissipation. For sub-65 nm CMOS technology where the gate oxide (SiO2) thickness is very low, the direct tunneling current is the major form of tunneling. This thesis makes two contributions: analytical modeling of behavioral level components for direct tunneling current and propagation delay, and the reduction of tunneling current during behavioral synthesis. Gate oxides of multiple thicknesses are useful in reducing the gate leakage dissipation. Analytical models from first principles to calculate the tunneling current and the propagation delay of behavioral level components are presented, which are backed by BSIM4/5 models and SPICE simulations. These components are characterized for 45 nm technology and an algorithm is provided for scheduling of datapath operations such that the overall tunneling current dissipation of a datapath circuit under design is minimal. It is observed that, because the oxide thickness being considered is very low, it may not remain constant during the course of fabrication. Hence …
Date: May 2006
Creator: Velagapudi, Ramakrishna
System: The UNT Digital Library

A Language and Visual Interface to Specify Complex Spatial Pattern Mining

Access: Use of this item is restricted to the UNT Community
The emerging interest in spatial pattern mining leads to the demand for a flexible spatial pattern mining language, on which an easy-to-use and easy-to-understand visual pattern language could be built. It is worthwhile to define a pattern mining language called LCSPM to allow users to specify complex spatial patterns. I describe a proposed pattern mining language in this paper. A visual interface which allows users to specify the patterns visually is developed. Visual pattern queries are translated into the LCSPM language by a parser, and the data mining process can be triggered afterwards. The visual language is based on and goes beyond the visual language proposed in the literature. I implemented a prototype system based on the open source JUMP framework.
Date: December 2006
Creator: Li, Xiaohui
System: The UNT Digital Library

Comparison and Evaluation of Existing Analog Circuit Simulator using Sigma-Delta Modulator

Access: Use of this item is restricted to the UNT Community
In the world of VLSI (very large scale integration) technology, there are many different types of circuit simulators that are used to design and predict the circuit behavior before actual fabrication of the circuit. In this thesis, I compared and evaluated existing circuit simulators by considering standard benchmark circuits. The circuit simulators which I evaluated and explored are Ngspice, Tclspice, Winspice (open source) and Spectre® (commercial). I also tested standard benchmarks using these circuit simulators and compared their outputs. The simulators are evaluated using design metrics in order to quantify their performance and identify efficient circuit simulators. In addition, I designed a sigma-delta modulator and its individual components using the analog behavioral language Verilog-A. Initially, I performed simulations of individual components of the sigma-delta modulator and later of the whole system. Finally, CMOS (complementary metal-oxide semiconductor) transistor-level circuits were designed for the differential amplifier, operational amplifier and comparator of the modulator.
Date: December 2006
Creator: Ale, Anil Kumar
System: The UNT Digital Library

A Multi-Variate Analysis of SMTP Paths and Relays to Restrict Spam and Phishing Attacks in Emails

Access: Use of this item is restricted to the UNT Community
The classifier discussed in this thesis considers the path traversed by an email (instead of its content) and the reputation of the relays, features inaccessible to spammers. Groups of spammers and individual behaviors of a spammer in a given domain were analyzed to yield association patterns, which were then used to identify similar spammers. Unsolicited and phishing emails were successfully isolated from legitimate emails, using analysis results. Spammers and phishers are also categorized into serial spammers/phishers, recent spammers/phishers, prospective spammers/phishers, and suspects. Legitimate emails and trusted domains are classified into socially close (family members, friends), socially distinct (strangers, etc.), and opt-outs (resolved false positives and false negatives). Overall, this classifier resulted in far fewer false positives when compared to current filters like SpamAssassin, achieving a 98.65% precision, which is comparable to the precisions achieved by SPF and DNSRBL blacklists.
Date: December 2006
Creator: Palla, Srikanth
System: The UNT Digital Library

Parallel Analysis of Aspect-Based Sentiment Summarization from Online Big-Data

Access: Use of this item is restricted to the UNT Community
Consumers' opinions and sentiments on products can reflect the performance of products in general or in various aspects. Analyzing these data is becoming feasible, considering the availability of immense data and the power of natural language processing. However, retailers have not taken full advantage of online comments. This work is dedicated to a solution for automatically analyzing and summarizing these valuable data at both product and category levels. In this research, a system was developed to retrieve and analyze extensive data from public online resources. A parallel framework was created to make this system extensible and efficient. In this framework, a star topological network was adopted in which each computing unit was assigned to retrieve a fraction of data and to assess sentiment. Finally, the preprocessed data were collected and summarized by the central machine, which generates the final result that can be rendered through a web interface. The system was designed to have sound performance, robustness, manageability, extensibility, and accuracy.
Date: May 2019
Creator: Wei, Jinliang
System: The UNT Digital Library

Enhanced Approach for the Classification of Ulcerative Colitis Severity in Colonoscopy Videos Using CNN

Access: Use of this item is restricted to the UNT Community
Ulcerative colitis (UC) is a chronic inflammatory disease characterized by periods of relapses and remissions affecting more than 500,000 people in the United States. To achieve the therapeutic goals of UC, which are to first induce and then maintain disease remission, doctors need to evaluate the severity of UC of a patient. However, it is very difficult to evaluate the severity of UC objectively because of the non-uniform nature of symptoms and large variations in their patterns. To address this, in our previous works, we developed two different approaches, one using image textures and the other using a CNN (convolutional neural network), to measure and classify objectively the severity of UC presented in optical colonoscopy video frames. However, we found that the image texture based approach could not handle a larger number of variations in their patterns, and the CNN based approach could not achieve very high accuracy. In this paper, we improve our CNN based approach in two ways to provide better accuracy for the classification. We add more thorough and essential preprocessing, and generate more classes to accommodate large variations in their patterns. The experimental results show that the proposed preprocessing can improve the overall accuracy …
Date: August 2019
Creator: Sure, Venkata Leela
System: The UNT Digital Library

Mining Biomedical Data for Hidden Relationship Discovery

Access: Use of this item is restricted to the UNT Community
With an ever-growing number of publications in the biomedical domain, it becomes likely that important implicit connections between individual concepts of biomedical knowledge are overlooked. Literature based discovery (LBD) has been in practice for many years to identify plausible associations between previously unrelated concepts. In this paper, we present a new, completely automatic and interactive system that creates a graph-based knowledge base to capture multifaceted complex associations among biomedical concepts. For a given pair of input concepts, our system auto-generates a list of ranked subgraphs uncovering possible previously unnoticed associations based on context information. To rank these subgraphs, we implement a novel ranking method using the context information obtained by performing random walks on the graph. In addition, we enhance the system by training a Neural Network Classifier to output the likelihood of the two concepts being related, which provides better insights to the end user.
Date: August 2019
Creator: Dharmavaram, Sirisha
System: The UNT Digital Library
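
Note on the random walks mentioned in the abstract above: the following is a minimal sketch of collecting visit counts from random walks on a small graph, not the ranking method of the thesis. The function name `random_walk_visits`, the adjacency-list format and the placeholder concept names are assumptions made for illustration.

```python
import random
from collections import Counter


def random_walk_visits(graph, start, num_walks=100, walk_length=5, seed=0):
    """Count node visits over repeated fixed-length random walks from start.

    graph maps each node to a list of neighbor nodes. A seeded generator
    keeps the walks reproducible.
    """
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(num_walks):
        node = start
        visits[node] += 1
        for _ in range(walk_length):
            neighbors = graph.get(node, [])
            if not neighbors:
                break  # dead end: stop this walk early
            node = rng.choice(neighbors)
            visits[node] += 1
    return visits


# Hypothetical toy graph; node names are placeholders, not real concepts.
toy_graph = {
    "concept_a": ["concept_b", "concept_c"],
    "concept_b": ["concept_a", "concept_c"],
    "concept_c": ["concept_a", "concept_b", "concept_d"],
    "concept_d": ["concept_c"],
}
visits = random_walk_visits(toy_graph, "concept_a", num_walks=200, walk_length=4, seed=42)
```

Nodes visited often from a given start node can be read as contextually close to it; how the thesis turns such context into subgraph rankings is not specified in the abstract.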

Automated Defense Against Worm Propagation.

Access: Use of this item is restricted to the UNT Community
Worms have caused significant destruction over the last few years. Network security elements such as firewalls, IDS, etc., have been ineffective against worms. Some worms are so fast that manual intervention is not possible. This brings in the need for a stronger security architecture which can automatically react to stop worm propagation. The method has to be signature independent so that it can stop new worms. In this thesis, an automated defense system (ADS) is developed to automate defense against worms and contain the worm to a level where manual intervention is possible. This is accomplished with a two level architecture with feedback at each level. The inner loop is based on control system theory and uses the properties of a PID (proportional, integral and differential) controller. The outer loop works at the network level and stops the worm from reaching its spread saturation point. In our lab setup, we verified that with only the inner loop active the worm was delayed, and with both loops active we were able to restrict the propagation to 10% of the targeted hosts. One concern for deployment of a worm containment mechanism was degradation of throughput for legitimate traffic. We found that with proper …
Date: December 2005
Creator: Patwardhan, Sudeep
System: The UNT Digital Library
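
Note on the PID controller mentioned in the abstract above: the following is a textbook discrete PID update, given as a sketch and not as the inner loop of the thesis. The class name `PID`, the gain values, and the idea of feeding it a measured connection rate are assumptions made for illustration.

```python
class PID:
    """Discrete proportional-integral-derivative controller."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the first sample

    def update(self, measurement, dt=1.0):
        """Return the control output for one new measurement."""
        error = self.setpoint - measurement
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical use: hold an observed connection rate near 10 per second.
pid = PID(kp=2.0, ki=0.5, kd=1.0, setpoint=10.0)
first = pid.update(14.0)   # rate above target gives a negative correction
second = pid.update(12.0)  # smaller overshoot gives a smaller correction
```

A negative output here would be interpreted as "throttle further"; how the thesis maps controller output onto network actions is not stated in the abstract.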

Design and Optimization of Components in a 45nm CMOS Phase Locked Loop

Access: Use of this item is restricted to the UNT Community
A novel scheme for optimizing the individual components of a phase locked loop (PLL), which is used for stable clock generation and synchronization of signals, is considered in this work. Verilog-A is used for the high level system design of the main components of the PLL, followed by the individual component-wise optimization. The design of experiments (DOE) approach to optimize the analog, 45nm voltage controlled oscillator (VCO) is presented. A mixed signal analysis using the analog and digital Verilog behavior of components is also studied. Overall, a high level system design of a PLL, a systematic optimization of each of its components, and an analog and mixed signal behavioral design approach have been implemented using Cadence custom IC design tools.
Date: December 2006
Creator: Sarivisetti, Gayathri
System: The UNT Digital Library