Application of Adaptive Techniques in Regression Testing for Modern Software Development (open access)

In this dissertation we investigate the applicability of different adaptive techniques to improve the effectiveness and efficiency of regression testing. Initially, we introduce the concept of regression testing. We then perform a literature review of current practices and state-of-the-art regression testing techniques. Finally, we advance the regression testing techniques by performing four empirical studies in which we use different types of information (e.g., user sessions, source code, and code commits) to investigate the effectiveness of each software metric on fault detection capability for different software environments. In our first empirical study, we show the effectiveness of applying user session information for test case prioritization. In our next study, we apply what we learned from the previous study and implement a collaborative filtering recommender system for test case prioritization, which takes user sessions and change history information as input parameters and returns the risk score associated with each component. Results of this study show that our recommender system improves the effectiveness of test prioritization; the performance of our approach was particularly noteworthy when we were under time constraints. We then investigate the merits of multi-objective testing over single objective techniques with a graph-based testing framework. Results of this study indicate that the …
Date: August 2019
Creator: Azizi, Maral
System: The UNT Digital Library
Models to Combat Email Spam Botnets and Unwanted Phone Calls (open access)

With the amount of email spam received these days it is hard to imagine that spammers act individually. Nowadays, most spam emails are sent from a collection of compromised machines controlled by spammers. These compromised computers are often called bots, and spammers use them to send massive volumes of spam within a short period of time. The motivation of this work is to understand and analyze the behavior of spammers through a large collection of spam mails. My research examined a data set collected over a 2.5-year period and developed an algorithm which would give the botnet features and then classify them into various groups. Principal component analysis was used to study the association patterns of groups of spammers and the individual behavior of a spammer in a given domain. This is based on the features which capture maximum variance of information we have clustered. Presence information is a growing tool towards more efficient communication and providing new services and features within a business setting and much more. The main contribution in my thesis is to propose the willingness estimator that can estimate the callee's willingness without his/her involvement; the model estimates willingness level based …
Date: May 2008
Creator: Husna, Husain
System: The UNT Digital Library
Scalable Next Generation Blockchains for Large Scale Complex Cyber-Physical Systems and Their Embedded Systems in Smart Cities (open access)

The original FlexiChain and its descendants are a revolutionary distributed ledger technology (DLT) for cyber-physical systems (CPS) and their embedded systems (ES). FlexiChain, a DLT implementation, uses cryptography, distributed ledgers, peer-to-peer communications, scalable networks, and consensus. FlexiChain facilitates data structure agreements. This thesis offers a Block Directed Acyclic Graph (BDAG) architecture to link blocks to their forerunners to speed up validation. These data blocks are securely linked. This dissertation introduces Proof of Rapid Authentication, a novel consensus algorithm. This innovative method uses a distributed file to safely store a unique identifier (UID) based on node attributes to verify two blocks faster. This study also addresses CPS hardware security. A system of interconnected, user-unique identifiers allows each block's history to be monitored. This maintains each transaction and the validators who checked the block to ensure trustworthiness and honesty. We constructed a digital version that stays in sync with the distributed ledger as all nodes are linked by a NodeChain. The ledger is distributed without compromising node autonomy. Moreover, FlexiChain Layer 0 distributed ledger is also introduced and can connect and validate Layer 1 blockchains. This project produced a DAG-based blockchain integration platform with hardware security. The results illustrate a practical technique …
Date: July 2023
Creator: Alkhodair, Ahmad Jamal M
System: The UNT Digital Library
Inferring Social and Internal Context Using a Mobile Phone (open access)

This dissertation is composed of research studies that contribute to three research areas including social context-aware computing, internal context-aware computing, and human behavioral data mining. In social context-aware computing, four studies are conducted. First, mobile phone user calling behavioral patterns are characterized in forms of randomness level where relationships among them are then identified. Next, a study is conducted to investigate the relationship between the calling behavior and organizational groups. Third, a method is presented to quantitatively define mobile social closeness and social groups, which are then used to identify social group sizes and scaling ratio. Last, based on the mobile social grouping framework, the significant role of social ties in communication patterns is revealed. In internal context-aware computing, two studies are conducted where the notions of internal context are intention and situation. For intentional context, the goal is to sense the intention of the user in placing calls. A model is thus presented for predicting future calls envisaged as a call predicted list (CPL), which makes use of call history to build a probabilistic model of calling behavior. As an incoming call predictor, CPL is a list of numbers/contacts that are the most likely to be the callers within …
Date: December 2009
Creator: Phithakkitnukoon, Santi
System: The UNT Digital Library
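
The call predicted list (CPL) mentioned in the abstract above builds a probabilistic model of calling behavior from call history. As a rough illustration only, the sketch below ranks contacts by their empirical call frequency. This is a deliberately simple stand-in, not the dissertation's model, and the function name `call_predicted_list` and the parameter `k` are hypothetical.

```python
from collections import Counter

def call_predicted_list(history, k=3):
    """Return the k most frequent contacts with their empirical probabilities.

    history: list of contact identifiers, one per past call (assumed input).
    This frequency ranking is an illustrative assumption, not the
    probabilistic model used in the dissertation.
    """
    counts = Counter(history)
    total = sum(counts.values())
    # most_common(k) sorts by count, highest first
    return [(contact, n / total) for contact, n in counts.most_common(k)]

if __name__ == "__main__":
    past_calls = ["alice", "bob", "alice", "carol", "alice", "bob"]
    for contact, prob in call_predicted_list(past_calls, k=2):
        print(f"{contact}: {prob:.2f}")
```

A fuller model would presumably also weigh recency and time of day, which this sketch ignores.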

Deep Learning Optimization and Acceleration

The novelty of this dissertation is the optimization and acceleration of deep neural networks aimed at real-time predictions with minimal energy consumption. It consists of cross-layer optimization, output directed dynamic quantization, and opportunistic near-data computation for deep neural network acceleration. The proposed deep neural network optimization and acceleration frameworks are tested on two datasets (CIFAR-10 and CIFAR-100) using a variety of convolutional neural networks (e.g., LeNet-5, VGG-16, GoogLeNet, DenseNet, ResNet). Experimental results are promising when compared to other state-of-the-art deep neural network acceleration efforts in the literature.
Date: August 2022
Creator: Jiang, Beilei
System: The UNT Digital Library

Integrating Multiple Deep Learning Models for Disaster Description in Low-Altitude Videos

Computer vision technologies are rapidly improving and becoming more important in disaster response. The majority of disaster description techniques now focus on either identifying objects or categorizing disasters. In this study, we trained multiple deep neural networks on low-altitude imagery with highly imbalanced and noisy labels. We utilize labeled images from the LADI dataset to formulate a solution for the general problem of disaster classification and object detection. Our research integrated and developed multiple deep learning models that perform the object detection task as well as the disaster scene classification task. Our solution is competitive in the TRECVID Disaster Scene Description and Indexing (DSDI) task, demonstrating that it is comparable to other suggested approaches in retrieving disaster-related video clips.
Date: December 2022
Creator: Wang, Haili
System: The UNT Digital Library
Probabilistic Analysis of Contracting Ebola Virus Using Contextual Intelligence (open access)

The outbreak of the Ebola virus was declared a Public Health Emergency of International Concern by the World Health Organization (WHO). Due to the complex nature of the outbreak, the Centers for Disease Control and Prevention (CDC) created interim guidance for monitoring people potentially exposed to Ebola, evaluating their intended travel, and restricting the movements of carriers when needed. Tools to evaluate the risk of individuals and groups of individuals contracting the disease could mitigate the growing anxiety and fear. The goal is to understand and analyze the nature of the risk an individual would face when he/she comes in contact with a carrier. This thesis presents a tool that makes use of contextual data intelligence to predict the risk factor of individuals who come in contact with the carrier.
Date: May 2017
Creator: Gopalakrishnan, Arjun
System: The UNT Digital Library

Blockchain for AI: Smarter Contracts to Secure Artificial Intelligence Algorithms

In this dissertation, I investigate the existing smart contract problems that limit cognitive abilities. I use Taylor series expansion, polynomial equations, and fraction-based computations to overcome the limitations of calculations in smart contracts. To prove the hypothesis, I use these mathematical models to compute complex operations of naive Bayes, linear regression, decision trees, and neural network algorithms on Ethereum public test networks. The smart contracts achieve 95% prediction accuracy compared to traditional programming language models, proving the soundness of the numerical derivations. Many non-real-time applications can use our solution for trusted and secure prediction services.
Date: July 2023
Creator: Badruddoja, Syed
System: The UNT Digital Library
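
The abstract above describes using Taylor series expansion and fraction-based computation to work around the limited arithmetic available in smart contracts. The sketch below illustrates that general idea in Python using integers only: it approximates e^x and a sigmoid with a truncated Taylor series in fixed-point form. It is an illustration under stated assumptions, not the dissertation's implementation; the names `SCALE`, `fixed_exp`, and `fixed_sigmoid`, the 18-digit scale, and the 30-term cutoff are all hypothetical choices.

```python
SCALE = 10**18  # assumed fixed-point scale: the integer SCALE represents 1.0

def fixed_exp(x_fixed, terms=30):
    """Approximate e**x with integer arithmetic only.

    x_fixed is x multiplied by SCALE and must be non-negative.
    Uses the Taylor series: sum over n of x**n / n!.
    """
    if x_fixed < 0:
        raise ValueError("this sketch handles non-negative inputs only")
    total = SCALE  # n = 0 term, i.e. 1.0
    term = SCALE
    for n in range(1, terms):
        # term_n = term_{n-1} * x / n, kept in fixed point via integer division
        term = term * x_fixed // (SCALE * n)
        total += term
    return total

def fixed_sigmoid(x_fixed):
    """sigmoid(x) = e**x / (1 + e**x) for x >= 0, as a fixed-point integer."""
    e = fixed_exp(x_fixed)
    return e * SCALE // (SCALE + e)

if __name__ == "__main__":
    print(fixed_exp(SCALE) / SCALE)      # approximately e
    print(fixed_sigmoid(0) / SCALE)      # sigmoid(0)
```

Each integer division truncates, so a small rounding error accumulates with every term. The series also converges more slowly for larger inputs, so a real contract would need to bound or range-reduce them.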
Sentence Similarity Analysis with Applications in Automatic Short Answer Grading (open access)

In this dissertation, I explore unsupervised techniques for the task of automatic short answer grading. I compare a number of knowledge-based and corpus-based measures of text similarity, evaluate the effect of domain and size on the corpus-based measures, and also introduce a novel technique to improve the performance of the system by integrating automatic feedback from the student answers. I continue to combine graph alignment features with lexical semantic similarity measures and employ machine learning techniques to show that grade assignment error can be reduced compared to a system that considers only lexical semantic measures of similarity. I also detail a preliminary attempt to align the dependency graphs of student and instructor answers in order to utilize a structural component that is necessary to simulate human-level grading of student answers. I further explore the utility of these techniques to several related tasks in natural language processing including the detection of text similarity, paraphrase, and textual entailment.
Date: August 2012
Creator: Mohler, Michael A. G.
System: The UNT Digital Library
The Procedural Generation of Interesting Sokoban Levels (open access)

As video games continue to become larger, more complex, and more costly to produce, research into methods to make game creation easier and faster becomes more valuable. One such research topic is procedural generation, which allows the computer to assist in the creation of content. This dissertation presents a new algorithm for the generation of Sokoban levels. Sokoban is a grid-based transport puzzle which is computationally interesting due to being PSPACE-complete. Beyond just generating levels, the question of whether or not the levels created by this algorithm are interesting to human players is explored. A study was carried out comparing player attention while playing handmade levels versus their attention during procedurally generated levels. An auditory Stroop test was used to measure attention without disrupting play.
Date: May 2015
Creator: Taylor, Joshua
System: The UNT Digital Library
Secure and Trusted Execution Framework for Virtualized Workloads (open access)

In this dissertation, we have analyzed various security and trust solutions for modern computing systems and proposed a framework that will provide holistic security and trust for the entire lifecycle of a virtualized workload. The framework consists of 3 novel techniques and a set of guidelines. These 3 techniques provide the necessary elements for a secure and trusted execution environment, while the guidelines ensure that the virtualized workload remains in a secure and trusted state throughout its lifecycle. We have successfully implemented the framework and demonstrated that it provides security and trust guarantees at the time of launch, any time during the execution, and during an update of the virtualized workload. Given the proliferation of virtualization from cloud servers to embedded systems, techniques presented in this dissertation can be implemented on most computing systems.
Date: August 2018
Creator: Kotikela, Srujan D
System: The UNT Digital Library
SIMON: A Domain-Agnostic Framework for Secure Design and Validation of Cyber Physical Systems (open access)

Cyber physical systems (CPS) are an integration of computational and physical processes, where the cyber components monitor and control physical processes. Cyber-attacks largely target the cyber components with the intention of disrupting the functionality of the components in the physical domain. This dissertation explores the role of semantic inference in understanding such attacks and building resilient CPS systems. To that end, we present SIMON, an ontological design and verification framework that captures the intricate relationship(s) between cyber and physical components in CPS by leveraging several standard ontologies and extending the NIST CPS framework for the purpose of eliciting trustworthy requirements, assigning responsibilities and roles to CPS functionalities, and validating that the trustworthy requirements are met by the designed system. We demonstrate the capabilities of SIMON using two case studies – a vehicle to infrastructure (V2I) safety application and an additive manufacturing (AM) printer. In addition, we also present a taxonomy to capture threat feeds specific to the AM domain.
Date: December 2021
Creator: Yanambaka Venkata, Rohith
System: The UNT Digital Library
Extracting Possessions and Their Attributes (open access)

Possession is an asymmetric semantic relation between two entities, where one entity (the possessee) belongs to the other entity (the possessor). Automatically extracting possessions is useful in identifying skills, in recommender systems, and in natural language understanding. Possessions can be found in different communication modalities including text, images, videos, and audio. In this dissertation, I elaborate on the techniques I used to extract possessions. I begin with extracting possessions at the sentence level including the type and temporal anchors. Then, I extract the duration of possession and co-possessions (if multiple possessors possess the same entity). Next, I extract possessions from an entire Wikipedia article capturing the change of possessors over time. I extract possessions from social media including both text and images. Finally, I also present dense annotations generating possession timelines. I present separate datasets, detailed corpus analysis, and machine learning models for each task described above.
Date: May 2020
Creator: Chinnappa, Dhivya Infant
System: The UNT Digital Library

Extracting Dimensions of Interpersonal Interactions and Relationships

People interact with each other through natural language to express feelings, thoughts, intentions, instructions, etc. These interactions in turn form relationships. Besides names of relationships like siblings, spouse, friends, etc., a number of dimensions (e.g., cooperative vs. competitive, temporary vs. enduring, equal vs. hierarchical) can also be used to capture the underlying properties of interpersonal interactions and relationships. More fine-grained descriptors (e.g., angry, rude, nice, supportive) can also be used to indicate the reasons or social acts behind the dimension cooperative vs. competitive. The way people interact with others may also tell us about their personal traits, which in turn may be indicative of their probable future success. The work presented in the dissertation involves creating corpora with fine-grained descriptors of interactions and relationships. We also describe experiments whose results indicate that the process of identifying the dimensions can be automated.
Date: August 2020
Creator: Rashid, Farzana
System: The UNT Digital Library
E‐Shape Analysis (open access)

The motivation of this work is to understand E-shape analysis and how it can be applied to various classification tasks. Its powerful feature is that it looks not at what information is contained but at how that information looks. This new technique gives E-shape analysis the ability to be language independent and to some extent size independent. In this thesis, I present a new mechanism to characterize an email without using content or context, called E-shape analysis for email. I explore the applications of the email shape by carrying out a case study, botnet detection, and two possible applications: spam filtering and social-context-based fingerprinting. In the second part of this thesis, I apply E-shape analysis to activity recognition of humans. Using the Android platform and a T-Mobile G1 phone, I collect data from the triaxial accelerometer and use it to classify the motion behavior of a subject.
Date: December 2009
Creator: Sroufe, Paul
System: The UNT Digital Library
A Control Theoretic Approach for Resilient Network Services (open access)

Resilient networks have the ability to provide the desired level of service despite challenges such as malicious attacks and misconfigurations. The primary goal of this dissertation is to provide uninterrupted network services in the face of an attack or any failures. This dissertation applies control system theory techniques with a focus on system identification and closed-loop feedback control. It explores the benefits of the system identification technique in designing and validating the model for complex and dynamic networks. Further, this dissertation focuses on designing robust feedback control mechanisms that are both scalable and effective in real-time. It focuses on employing dynamic and predictive control approaches to reduce the impact of an attack on network services. The closed-loop feedback control mechanisms tackle this issue by degrading the network services gracefully to an acceptable level and then stabilizing the network in real-time (less than 50 seconds). Employing these feedback mechanisms also provides the ability to automatically configure the settings such that the QoS metrics of the network are consistent with those specified in the service level agreements.
Date: December 2018
Creator: Vempati, Jagannadh Ambareesh
System: The UNT Digital Library
Modeling Epidemics on Structured Populations: Effects of Socio-demographic Characteristics and Immune Response Quality (open access)

Epidemiologists engage in the study of the distribution and determinants of health-related states or events in human populations. Eventually, they will apply that study to prevent and control problems and contingencies associated with the health of the population. Due to the spread of new pathogens and the emergence of new bio-terrorism threats, it has become imperative to develop new and expand existing techniques to equip public health providers with robust tools to predict and control health-related crises. In this dissertation, I explore the effects caused in the disease dynamics by the differences in individuals’ physiology and social/behavioral characteristics. Multiple computational and mathematical models were developed to quantify the effect of those factors on spatial and temporal variations of the disease epidemics. I developed statistical methods to measure the effects caused in the outbreak dynamics by the incorporation of heterogeneous demographics and social interactions to the individuals of the population. Specifically, I studied the relationship between demographics and the physiological characteristics of an individual when preparing for an infectious disease epidemic.
Date: August 2014
Creator: Reyes Silveyra, Jorge A.
System: The UNT Digital Library
Ontology Based Security Threat Assessment and Mitigation for Cloud Systems (open access)

A malicious actor often relies on security vulnerabilities of IT systems to launch a cyber attack. Most cloud services are supported by an orchestration of large and complex systems which are prone to vulnerabilities, making threat assessment very challenging. In this research, I developed formal and practical ontology-based techniques that enable automated evaluation of a cloud system's security threats. I use an architecture for threat assessment of cloud systems that leverages a dynamically generated ontology knowledge base. I created an ontology model and represented the components of a cloud system. These ontologies are designed for a set of domains that covers some aspects of the cloud and cyber threat data for information technology products. The inputs to our architecture are the configurations of cloud assets and component specifications (which encompass the desired assessment procedures), and the outputs are actionable threat assessment results. The focus of this work is on ways of enumerating, assessing, and mitigating emerging cyber security threats. A research toolkit system has been developed to evaluate our architecture. We expect our techniques to be leveraged by any cloud provider or consumer in closing the gap of identifying and remediating known or impending security threats facing their cloud's assets.
Date: December 2018
Creator: Kamongi, Patrick
System: The UNT Digital Library
Procedural Generation of Content for Online Role Playing Games (open access)

Video game players demand a volume of content far in excess of the ability of game designers to create it. For example, a single quest might take a week to develop and test, which means that companies such as Blizzard are spending millions of dollars each month on new content for their games. As a result, both players and developers are frustrated with the inability to meet the demand for new content. By generating content on-demand, it is possible to create custom content for each player based on player preferences. It is also possible to make use of the current world state during generation, something which cannot be done with current techniques. Using developers to create rules and assets for a content generator instead of creating content directly will lower development costs as well as reduce the development time for new game content to seconds rather than days. This work is part of the field of computational creativity, and involves the use of computers to create aesthetically pleasing game content, such as terrain, characters, and quests. I demonstrate agent-based terrain generation, and economic modeling of game spaces. I also demonstrate the autonomous generation of quests for online role playing games, …
Date: August 2014
Creator: Doran, Jonathon
System: The UNT Digital Library
Traffic Forecasting Applications Using Crowdsourced Traffic Reports and Deep Learning (open access)

Intelligent transportation systems (ITS) are essential tools for traffic planning, analysis, and forecasting that can utilize the huge amount of traffic data available nowadays. In this work, we aggregated detailed traffic flow sensor data, Waze reports, OpenStreetMap (OSM) features, and weather data from the California Bay Area for 6 months. Using that data, we studied three novel ITS applications using convolutional neural networks (CNNs) and recurrent neural networks (RNNs). The first experiment is an analysis of the relation between roadway shapes and accident occurrence, where results show that the speed limit and number of lanes are significant predictors for major accidents on highways. The second experiment presents a novel method for forecasting congestion severity using crowdsourced data only (Waze, OSM, and weather), without the need for traffic sensor data. The third experiment studies the improvement of traffic flow forecasting using accidents, number of lanes, weather, and time-related features, where results show significant performance improvements when the additional features were used.
Date: May 2020
Creator: Alammari, Ali
System: The UNT Digital Library

Reliability Characterization and Performance Analysis of Solid State Drives in Data Centers

NAND flash-based solid state drives (SSDs) have been widely adopted in data centers and high performance computing (HPC) systems due to their better performance compared with hard disk drives. However, little is known about the reliability characteristics of SSDs in production systems. Existing works that study the statistical distributions of SSD failures in the field lack insights into distinct characteristics of SSDs. In this dissertation, I explore the SSD-specific SMART (Self-Monitoring, Analysis, and Reporting Technology) attributes and conduct in-depth analysis of SSD reliability in a production environment with a focus on the unique error types and health dynamics. QLC SSDs deliver better performance in a cost-effective way. I study QLC SSDs in terms of their architecture and performance. In addition, I apply thermal stress tests to QLC SSDs and quantify their performance degradation processes. Various types of big data and machine learning workloads have been executed on SSDs under varying temperatures. The SSD throughput and application performance are analyzed and characterized.
Date: December 2021
Creator: Liang, Shuwen (Computer science and engineering researcher)
System: The UNT Digital Library
A New Look at Retargetable Compilers (open access)

Consumers demand new and innovative personal computing devices every 2 years when their cellular phone service contracts are renewed. Yet, a 2 year development cycle for the concurrent development of both hardware and software is nearly impossible. As more components and features are added to the devices, maintaining this 2 year cycle with current tools will become commensurately harder. This dissertation delves into the feasibility of simplifying the development of such systems by employing heterogeneous systems on a chip in conjunction with a retargetable compiler such as the hybrid computer retargetable compiler (Hy-C). An example of a simple architecture description of sufficient detail for use with a retargetable compiler like Hy-C is provided. As a software engineer with 30 years of experience, I have witnessed numerous system failures. A plethora of software development paradigms and tools have been employed to prevent software errors, but none have been completely successful. Much discussion centers on software development in the military contracting market, as that is my background. The dissertation reviews those tools, as well as some existing retargetable compilers, in an attempt to determine how those errors occurred and how a system like Hy-C could assist in reducing future software errors. In …
Date: December 2014
Creator: Burke, Patrick William
System: The UNT Digital Library
Improving Memory Performance for Both High Performance Computing and Embedded/Edge Computing Systems (open access)

The CPU-memory bottleneck is a widely recognized problem. It is known that the majority of high performance computing (HPC) database systems are configured with large memories and dedicated to processing specific workloads like weather prediction, molecular dynamics simulations, etc. My research on optimal address mapping improves memory performance by increasing channel and bank level parallelism. In another research direction, I proposed and evaluated adaptive page migration techniques that obviate the need for offline analysis of an application to determine page migration strategies. Furthermore, I explored different migration strategies, like reverse migration and sub page migration, that I found to be beneficial depending on the application behavior. Ideally, page migration strategies redirect the demand memory traffic to faster memory to improve memory performance. In my third contribution, I developed and evaluated a memory-side accelerator to assist the main computational core in locating the non-zero elements of a sparse matrix, which are typically used in scientific and machine learning workloads, on a low-power embedded system configuration. Thus my contributions narrow the speed gap by improving the latency and/or bandwidth between CPU and memory.
Date: December 2021
Creator: Adavally, Shashank
System: The UNT Digital Library
Detection of Generalizable Clone Security Coding Bugs Using Graphs and Learning Algorithms (open access)

This research methodology isolates coding properties and identifies the probability of security vulnerabilities using machine learning and historical data. Several approaches characterize the effectiveness of detecting security-related bugs that manifest as vulnerabilities, but none utilize vulnerability patch information. The main contribution of this research is a framework that analyzes LLVM Intermediate Representation code and merges core source code representations using source code properties. This research is beneficial because it allows source programs to be transformed into a graphical form from which users can extract specific code properties related to vulnerable functions. The result is an improved approach to detect, identify, and track software system vulnerabilities based on a performance evaluation. The methodology uses historical function level vulnerability information, unique feature extraction techniques, a novel code property graph, and learning algorithms to minimize the amount of end user domain knowledge necessary to detect vulnerabilities in applications. The analysis shows approximately 99% precision and recall in detecting known vulnerabilities in the National Institute of Standards and Technology (NIST) Software Assurance Metrics and Tool Evaluation (SAMATE) project. Furthermore, 72% of the historical vulnerabilities in the OpenSSL testing environment were detected using a linear support vector classifier (SVC) model.
Date: December 2018
Creator: Mayo, Quentin R
System: The UNT Digital Library