Enhancing Storage Dependability and Computing Energy Efficiency for Large-Scale High Performance Computing Systems

Access: Use of this item is restricted to the UNT Community
With the advent of the information explosion age, larger-capacity disk drives are used to store data and powerful devices are used to process big data. As the scale and complexity of computer systems increase, we expect these systems to provide dependable and energy-efficient services and computation. Although hard drives are reliable in general, they are the most commonly replaced hardware components. Disk failures cause data corruption and even data loss, which can significantly degrade system performance and cause financial losses. In this dissertation research, I analyze different manifestations of disk failures in production data centers and explore data mining techniques combined with statistical analysis methods to discover categories of disk failures and their distinctive properties. I use similarity measures to quantify the degradation process of each failure type and derive the degradation signature. The derived degradation signatures are further leveraged to forecast when future disk failures may happen. Meanwhile, this dissertation also studies the energy efficiency of high performance computers. Specifically, I characterize the power and energy consumption of Haswell processors, which are used in multiple supercomputers, and analyze the power and energy consumption of Legion, a data-centric programming model and runtime system, and Legion applications. We find that power and energy …
Date: May 2019
Creator: Huang, Song
System: The UNT Digital Library

Revealing the Positive Meaning of a Negation

Access: Use of this item is restricted to the UNT Community
Negation is a complex phenomenon present in all human languages, allowing for the uniquely human capacities of denial, contradiction, misrepresentation, lying, and irony. It is in the first place a phenomenon of semantic opposition. Sentences containing negation are generally (a) less informative than affirmative ones, (b) morphosyntactically more marked (all languages have negative markers while only a few have affirmative markers), and (c) psychologically more complex and harder to process. Negation often conveys positive meaning. This meaning ranges from implicatures to entailments. In this dissertation, I develop a system to reveal the underlying positive interpretation of negation. I first identify which words are intended to be negated (i.e., the focus of negation), and then I rewrite those tokens to generate an actual positive interpretation. I identify the focus of negation by scoring probable foci along a continuous scale. One of the obstacles to exploring foci scoring is that no public datasets exist for this task. Thus, to study this problem I create new corpora. The corpora contain verbal, nominal, and adjectival negations and their potential positive interpretations, along with their scores ranging from 1 to 5. Then, I use supervised learning models for scoring the focus of negation. In order to …
Date: May 2019
Creator: Sarabi, Zahra
System: The UNT Digital Library

A Performance and Security Analysis of Elliptic Curve Cryptography Based Real-Time Media Encryption

Access: Use of this item is restricted to the UNT Community
This dissertation emphasizes the security aspects of real-time media. The problems of existing real-time media protections are identified in this research, and viable solutions are proposed. First, the security of real-time media depends on the Secure Real-time Transport Protocol (SRTP) mechanism. We identified drawbacks of existing SRTP systems, which use symmetric key encryption schemes, that attackers can exploit. Elliptic Curve Cryptography (ECC), an asymmetric key cryptography scheme, is proposed to resolve these problems. Second, the ECC encryption scheme is based on elliptic curves. This dissertation explores the weaknesses of a widely used elliptic curve in terms of security and describes a more secure elliptic curve suitable for real-time media protection. Eighteen elliptic curves were tested in a real-time video transmission system, and fifteen elliptic curves were tested in a real-time audio transmission system. Based on the performance, the X9.62 standard 256-bit prime curve, NIST-recommended 256-bit prime curves, and Brainpool 256-bit prime curves were found to be suitable for real-time audio encryption. Similarly, the X9.62 standard 256-bit prime and 272-bit binary curves and NIST-recommended 256-bit prime curves were found to be suitable for real-time video encryption. The weaknesses of NIST-recommended elliptic curves are discussed and a more secure new …
Date: December 2019
Creator: Sen, Nilanjan
System: The UNT Digital Library

SurfKE: A Graph-Based Feature Learning Framework for Keyphrase Extraction

Access: Use of this item is restricted to the UNT Community
Current unsupervised approaches for keyphrase extraction compute a single importance score for each candidate word by considering the number and quality of its associated words in the graph, and they are not flexible enough to incorporate multiple types of information. For instance, nodes in a network may exhibit diverse connectivity patterns that are not captured by graph-based ranking methods. To address this, we present a new approach to keyphrase extraction that represents the document as a word graph and exploits its structure to reveal underlying explanatory factors hidden in the data that may distinguish keyphrases from non-keyphrases. Experimental results show that our model, which uses phrase graph representations in a supervised probabilistic framework, obtains remarkable improvements in performance over previous supervised and unsupervised keyphrase extraction systems.
Date: August 2019
Creator: Florescu, Corina Andreea
System: The UNT Digital Library
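
The single-score graph ranking that the SurfKE abstract describes as the existing unsupervised baseline can be illustrated with a minimal sketch. This is not the SurfKE method itself: it assumes a TextRank-style setup in which adjacent words are linked and PageRank assigns each word one score. The function names, the window size, the damping factor, and the toy token list are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_word_graph(tokens, window=2):
    """Undirected graph linking words that occur within `window` tokens of each other."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank: yields exactly one importance score per word."""
    n = len(graph)
    scores = {w: 1.0 / n for w in graph}
    for _ in range(iterations):
        scores = {
            w: (1 - damping) / n
            + damping * sum(scores[nb] / len(graph[nb]) for nb in graph[w])
            for w in graph
        }
    return scores


# Hypothetical toy document, already tokenized and filtered (assumption).
tokens = "graph ranking scores words graph ranking links words graph".split()
graph = build_word_graph(tokens)
scores = pagerank(graph)
```

Each word ends up with one number that depends only on how many neighbors it has and how highly those neighbors score, which is the limitation the abstract points to: richer connectivity patterns around a node are collapsed into that single value.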

Spatial Partitioning Algorithms for Solving Location-Allocation Problems

Access: Use of this item is restricted to the UNT Community
This dissertation presents spatial partitioning algorithms to solve location-allocation problems. Location-allocation problems pertain to both the selection of facilities to serve demand at demand points and the assignment of demand points to the selected or known facilities. In the first part of this dissertation, we focus on the well-known and well-researched location-allocation problem, the "p-median problem", which is a distance-based location-allocation problem that involves selection and allocation of p facilities for n demand points. We evaluate the performance of existing p-median heuristic algorithms and investigate the impact of the scale of the problem and the spatial distribution of demand points on the performance of these algorithms. Based on the results from this comparative study, we present guidelines for location analysts to aid them in selecting the best heuristic and corresponding parameters depending on the problem at hand. Additionally, we found that existing heuristic algorithms are not suitable for solving large-scale p-median problems in a reasonable amount of time. We present a density-based decomposition methodology to solve large-scale p-median problems efficiently. This algorithm identifies dense clusters in the region and uses a MapReduce procedure to select facilities in the clustered regions independently and combine the solutions from the subproblems. Lastly, …
Date: December 2019
Creator: Gwalani, Harsha
System: The UNT Digital Library
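
To make the p-median problem named in the abstract above concrete, here is a minimal sketch that solves a tiny instance by exhaustive search. It is not one of the heuristics or the decomposition method from the dissertation. It assumes Euclidean distance, unit demand at every point, and candidate facility sites restricted to the demand points themselves; the function names and the toy coordinates are hypothetical.

```python
from itertools import combinations
from math import dist


def p_median_cost(points, facilities):
    """Total distance from each demand point to its nearest chosen facility."""
    return sum(min(dist(p, f) for f in facilities) for p in points)


def brute_force_p_median(points, p):
    """Try every set of p sites drawn from the demand points and keep the cheapest."""
    best = min(combinations(points, p), key=lambda sites: p_median_cost(points, sites))
    return list(best), p_median_cost(points, best)


# Hypothetical toy data: two tight clusters of three demand points each.
demand = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sites, cost = brute_force_p_median(demand, p=2)
print(sites, cost)  # [(0, 0), (10, 10)] 4.0
```

The search examines every combination of p sites out of n points, so its running time grows combinatorially with n. That growth is why practical work on this problem relies on heuristics and, for large instances, on splitting the region into subproblems as the abstract describes.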