An Approach Towards Self-Supervised Classification Using Cyc (open access)

An Approach Towards Self-Supervised Classification Using Cyc

Due to the long duration required to perform manual knowledge entry by human knowledge engineers it is desirable to find methods to automatically acquire knowledge about the world by accessing online information. In this work I examine using the Cyc ontology to guide the creation of Naïve Bayes classifiers to provide knowledge about items described in Wikipedia articles. Given an initial set of Wikipedia articles the system uses the ontology to create positive and negative training sets for the classifiers in each category. The order in which classifiers are generated and used to test articles is also guided by the ontology. The research conducted shows that a system can be created that utilizes statistical text classification methods to extract information from an ad-hoc generated information source like Wikipedia for use in a formal semantic ontology like Cyc. Benefits and limitations of the system are discussed along with future work.
Date: December 2006
Creator: Coursey, Kino High
System: The UNT Digital Library
CLUE: A Cluster Evaluation Tool (open access)

CLUE: A Cluster Evaluation Tool

Modern high performance computing is dependent on parallel processing systems. Most current benchmarks reveal only the high level computational throughput metrics, which may be sufficient for single processor systems, but can lead to a misrepresentation of true system capability for parallel systems. A new benchmark is therefore proposed. CLUE (Cluster Evaluator) uses a cellular automata algorithm to evaluate the scalability of parallel processing machines. The benchmark also uses algorithmic variations to evaluate individual system components' impact on the overall serial fraction and efficiency. CLUE is not a replacement for other performance-centric benchmarks, but rather shows the scalability of a system and provides metrics to reveal where one can improve overall performance. CLUE is a new benchmark which demonstrates a better comparison among different parallel systems than existing benchmarks and can diagnose where a particular parallel system can be optimized.
Date: December 2006
Creator: Parker, Brandon S.
System: The UNT Digital Library
Comparative Study of RSS-Based Collaborative Localization Methods in Wireless Sensor Networks (open access)

Comparative Study of RSS-Based Collaborative Localization Methods in Wireless Sensor Networks

In this thesis two collaborative localization techniques are studied: multidimensional scaling (MDS) and maximum likelihood estimator (MLE). A synthesis of a new location estimation method through a serial integration of these two techniques, such that an estimate is first obtained using MDS and then MLE is employed to fine-tune the MDS solution, was the subject of this research using various simulation and experimental studies. In the simulations, important issues including the effects of sensor node density, reference node density and different deployment strategies of reference nodes were addressed. In the experimental study, the path loss model of indoor environments is developed by determining the environment-specific parameters from the experimental measurement data. Then, the empirical path loss model is employed in the analysis and simulation study of the performance of collaborative localization techniques.
Date: December 2006
Creator: Koneru, Avanthi
System: The UNT Digital Library

Comparison and Evaluation of Existing Analog Circuit Simulator using Sigma-Delta Modulator

Access: Use of this item is restricted to the UNT Community
In the world of VLSI (very large scale integration) technology, there are many different types of circuit simulators that are used to design and predict the circuit behavior before actual fabrication of the circuit. In this thesis, I compared and evaluated existing circuit simulators by considering standard benchmark circuits. The circuit simulators which I evaluated and explored are Ngspice, Tclspice, Winspice (open source) and Spectre® (commercial). I also tested standard benchmarks using these circuit simulators and compared their outputs. The simulators are evaluated using design metrics in order to quantify their performance and identify efficient circuit simulators. In addition, I designed a sigma-delta modulator and its individual components using the analog behavioral language Verilog-A. Initially, I performed simulations of individual components of the sigma-delta modulator and later of the whole system. Finally, CMOS (complementary metal-oxide semiconductor) transistor-level circuits were designed for the differential amplifier, operational amplifier and comparator of the modulator.
Date: December 2006
Creator: Ale, Anil Kumar
System: The UNT Digital Library

Design and Optimization of Components in a 45nm CMOS Phase Locked Loop

Access: Use of this item is restricted to the UNT Community
A novel scheme of optimizing the individual components of a phase locked loop (PLL) which is used for stable clock generation and synchronization of signals is considered in this work. Verilog-A is used for the high level system design of the main components of the PLL, followed by the individual component wise optimization. The design of experiments (DOE) approach to optimize the analog, 45nm voltage controlled oscillator (VCO) is presented. Also a mixed signal analysis using the analog and digital Verilog behavior of components is studied. Overall a high level system design of a PLL, a systematic optimization of each of its components, and an analog and mixed signal behavioral design approach have been implemented using cadence custom IC design tools.
Date: December 2006
Creator: Sarivisetti, Gayathri
System: The UNT Digital Library
Energy-Aware Time Synchronization in Wireless Sensor Networks (open access)

Energy-Aware Time Synchronization in Wireless Sensor Networks

I present a time synchronization algorithm for wireless sensor networks that aims to conserve sensor battery power. The proposed method creates a hierarchical tree by flooding the sensor network from a designated source point. It then uses a hybrid algorithm derived from the timing-sync protocol for sensor networks (TSPN) and the reference broadcast synchronization method (RBS) to periodically synchronize sensor clocks by minimizing energy consumption. In multi-hop ad-hoc networks, a depleted sensor will drop information from all other sensors that route data through it, decreasing the physical area being monitored by the network. The proposed method uses several techniques and thresholds to maintain network connectivity. A new root sensor is chosen when the current one's battery power decreases to a designated value. I implement this new synchronization technique using Matlab and show that it can provide significant power savings over both TPSN and RBS.
Date: December 2006
Creator: Saravanos, Yanos
System: The UNT Digital Library
Grid-based Coordinated Routing in Wireless Sensor Networks (open access)

Grid-based Coordinated Routing in Wireless Sensor Networks

Wireless sensor networks are battery-powered ad-hoc networks in which sensor nodes that are scattered over a region connect to each other and form multi-hop networks. These nodes are equipped with sensors such as temperature sensors, pressure sensors, and light sensors and can be queried to get the corresponding values for analysis. However, since they are battery operated, care has to be taken so that these nodes use energy efficiently. One of the areas in sensor networks where an energy analysis can be done is routing. This work explores grid-based coordinated routing in wireless sensor networks and compares the energy available in the network over time for different grid sizes.
Date: December 2006
Creator: Sawant, Uttara
System: The UNT Digital Library

A Language and Visual Interface to Specify Complex Spatial Pattern Mining

Access: Use of this item is restricted to the UNT Community
The emerging interests in spatial pattern mining leads to the demand for a flexible spatial pattern mining language, on which easy to use and understand visual pattern language could be built. It is worthwhile to define a pattern mining language called LCSPM to allow users to specify complex spatial patterns. I describe a proposed pattern mining language in this paper. A visual interface which allows users to specify the patterns visually is developed. Visual pattern queries are translated into the LCSPM language by a parser and data mining process can be triggered afterwards. The visual language is based on and goes beyond the visual language proposed in literature. I implemented a prototype system based on the open source JUMP framework.
Date: December 2006
Creator: Li, Xiaohui
System: The UNT Digital Library
Mediation on XQuery Views (open access)

Mediation on XQuery Views

The major goal of information integration is to provide efficient and easy-to-use access to multiple heterogeneous data sources with a single query. At the same time, one of the current trends is to use standard technologies for implementing solutions to complex software problems. In this dissertation, I used XML and XQuery as the standard technologies and have developed an extended projection algorithm to provide a solution to the information integration problem. In order to demonstrate my solution, I implemented a prototype mediation system called Omphalos based on XML related technologies. The dissertation describes the architecture of the system, its metadata, and the process it uses to answer queries. The system uses XQuery expressions (termed metaqueries) to capture complex mappings between global schemas and data source schemas. The system then applies these metaqueries in order to rewrite a user query on a virtual global database (representing the integrated view of the heterogeneous data sources) to a query (termed an outsourced query) on the real data sources. An extended XML document projection algorithm was developed to increase the efficiency of selecting the relevant subset of data from an individual data source to answer the user query. The system applies the projection algorithm …
Date: December 2006
Creator: Peng, Xiaobo
System: The UNT Digital Library

A Multi-Variate Analysis of SMTP Paths and Relays to Restrict Spam and Phishing Attacks in Emails

Access: Use of this item is restricted to the UNT Community
The classifier discussed in this thesis considers the path traversed by an email (instead of its content) and reputation of the relays, features inaccessible to spammers. Groups of spammers and individual behaviors of a spammer in a given domain were analyzed to yield association patterns, which were then used to identify similar spammers. Unsolicited and phishing emails were successfully isolated from legitimate emails, using analysis results. Spammers and phishers are also categorized into serial spammers/phishers, recent spammers/phishers, prospective spammers/phishers, and suspects. Legitimate emails and trusted domains are classified into socially close (family members, friends), socially distinct (strangers etc), and opt-outs (resolved false positives and false negatives). Overall this classifier resulted in far less false positives when compared to current filters like SpamAssassin, achieving a 98.65% precision, which is well comparable to the precisions achieved by SPF, DNSRBL blacklists.
Date: December 2006
Creator: Palla, Srikanth
System: The UNT Digital Library
Natural Language Interfaces to Databases (open access)

Natural Language Interfaces to Databases

Natural language interfaces to databases (NLIDB) are systems that aim to bridge the gap between the languages used by humans and computers, and automatically translate natural language sentences to database queries. This thesis proposes a novel approach to NLIDB, using graph-based models. The system starts by collecting as much information as possible from existing databases and sentences, and transforms this information into a knowledge base for the system. Given a new question, the system will use this knowledge to analyze and translate the sentence into its corresponding database query statement. The graph-based NLIDB system uses English as the natural language, a relational database model, and SQL as the formal query language. In experiments performed with natural language questions ran against a large database containing information about U.S. geography, the system showed good performance compared to the state-of-the-art in the field.
Date: December 2006
Creator: Chandra, Yohan
System: The UNT Digital Library

A Netcentric Scientific Research Repository

Access: Use of this item is restricted to the UNT Community
The Internet and networks in general have become essential tools for disseminating in-formation. Search engines have become the predominant means of finding information on the Web and all other data repositories, including local resources. Domain scientists regularly acquire and analyze images generated by equipment such as microscopes and cameras, resulting in complex image files that need to be managed in a convenient manner. This type of integrated environment has been recently termed a netcentric sci-entific research repository. I developed a number of data manipulation tools that allow researchers to manage their information more effectively in a netcentric environment. The specific contributions are: (1) A unique interface for management of data including files and relational databases. A wrapper for relational databases was developed so that the data can be indexed and searched using traditional search engines. This approach allows data in databases to be searched with the same interface as other data. Fur-thermore, this approach makes it easier for scientists to work with their data if they are not familiar with SQL. (2) A Web services based architecture for integrating analysis op-erations into a repository. This technique allows the system to leverage the large num-ber of existing tools by wrapping them …
Date: December 2006
Creator: Harrington, Brian
System: The UNT Digital Library
Power-benefit analysis of erasure encoding with redundant routing in sensor networks. (open access)

Power-benefit analysis of erasure encoding with redundant routing in sensor networks.

One of the problems sensor networks face is adversaries corrupting nodes along the path to the base station. One way to reduce the effect of these attacks is multipath routing. This introduces some intrusion-tolerance in the network by way of redundancy but at the cost of a higher power consumption by the sensor nodes. Erasure coding can be applied to this scenario in which the base station can receive a subset of the total data sent and reconstruct the entire message packet at its end. This thesis uses two commonly used encodings and compares their performance with respect to power consumed for unencoded data in multipath routing. It is found that using encoding with multipath routing reduces the power consumption and at the same time enables the user to send reasonably large data sizes. The experiments in this thesis were performed on the Tiny OS platform with the simulations done in TOSSIM and the power measurements were taken in PowerTOSSIM. They were performed on the simple radio model and the lossy radio model provided by Tiny OS. The lossy radio model was simulated with distances of 10 feet, 15 feet and 20 feet between nodes. It was found that by …
Date: December 2006
Creator: Vishwanathan, Roopa
System: The UNT Digital Library
Timing and Congestion Driven Algorithms for FPGA Placement (open access)

Timing and Congestion Driven Algorithms for FPGA Placement

Placement is one of the most important steps in physical design for VLSI circuits. For field programmable gate arrays (FPGAs), the placement step determines the location of each logic block. I present novel timing and congestion driven placement algorithms for FPGAs with minimal runtime overhead. By predicting the post-routing timing-critical edges and estimating congestion accurately, this algorithm is able to simultaneously reduce the critical path delay and the minimum number of routing tracks. The core of the algorithm consists of a criticality-history record of connection edges and a congestion map. This approach is applied to the 20 largest Microelectronics Center of North Carolina (MCNC) benchmark circuits. Experimental results show that compared with the state-of-the-art FPGA place and route package, the Versatile Place and Route (VPR) suite, this algorithm yields an average of 8.1% reduction (maximum 30.5%) in the critical path delay and 5% reduction in channel width. Meanwhile, the average runtime of the algorithm is only 2.3X as of VPR.
Date: December 2006
Creator: Zhuo, Yue
System: The UNT Digital Library