The Value of Everything: Ranking and Association with Encyclopedic Knowledge (open access)

This dissertation describes WikiRank, an unsupervised method of assigning relative values to elements of a broad-coverage encyclopedic information source in order to identify those entries that may be relevant to a given piece of text. The valuation given to an entry is based not on textual similarity but instead on the links that associate entries, and an estimation of the expected frequency of visitation that would be given to each entry based on those associations in context. This estimation of relative frequency of visitation is embodied in modifications to the random walk interpretation of the PageRank algorithm. WikiRank is an effective algorithm to support natural language processing applications. First, it is shown to exceed the performance of previous machine learning algorithms for the task of automatic topic identification, providing results comparable to those of human annotators. Second, WikiRank is found useful for the task of recognizing text-based paraphrases on a semantic level, by comparing the distribution of attention generated by two pieces of text using the encyclopedic resource as a common reference. Finally, WikiRank is shown to have the ability to use its base of encyclopedic knowledge to recognize terms from different ontologies as describing the same thing, and thus …
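The modified random-walk idea above can be sketched with a biased (personalized) PageRank power iteration over a small link graph. The graph, damping factor, and uniform bias vector below are illustrative assumptions, not the dissertation's actual parameters.

```python
# Minimal sketch of a biased PageRank random walk over a link graph.
# The bias vector stands in for the context-dependent attention a piece
# of text gives to encyclopedia entries (a hypothetical simplification).

def pagerank(links, bias, damping=0.85, iterations=50):
    """links: node -> list of outgoing-link targets.
    bias:  node -> restart probability (sums to 1)."""
    nodes = list(links)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) * bias.get(n, 0.0) for n in nodes}
        for n in nodes:
            out = links[n]
            if out:
                share = damping * rank[n] / len(out)
                for target in out:
                    new[target] += share
            else:  # dangling node: redistribute its mass via the bias
                for m in nodes:
                    new[m] += damping * rank[n] * bias.get(m, 0.0)
        rank = new
    return rank

# Entry "b" is linked by both "a" and "c", so the walk visits it most often.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "b"]}
ranks = pagerank(graph, bias={"a": 1 / 3, "b": 1 / 3, "c": 1 / 3})
```

Skewing the bias toward entries mentioned in a given text would skew the stationary visitation frequencies accordingly, which is the intuition behind valuing entries by expected visitation.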
Date: December 2009
Creator: Coursey, Kino High
System: The UNT Digital Library
Infusing Automatic Question Generation with Natural Language Understanding (open access)

Automatically generating questions from text for educational purposes is an active research area in natural language processing. The automatic question generation system accompanying this dissertation is MARGE, which is a recursive acronym for: MARGE automatically reads, generates, and evaluates. MARGE generates questions from both individual sentences and the passage as a whole, and is the first question generation system to successfully generate meaningful questions from textual units larger than a sentence. Prior work in automatic question generation from text treats a sentence as a string of constituents to be rearranged into as many questions as allowed by English grammar rules. Consequently, such systems overgenerate and create mainly trivial questions. Further, none of these systems to date has been able to automatically determine which questions are meaningful and which are trivial. This is because the research focus has been placed on natural language generation (NLG) at the expense of natural language understanding (NLU). In contrast, the work presented here infuses the question generation process with natural language understanding. From the input text, MARGE creates a meaning analysis representation for each sentence in a passage via the DeconStructure algorithm presented in this work. Questions are generated from sentence meaning analysis representations using templates. The generated questions are automatically …
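The template step can be illustrated with a toy meaning representation. The (subject, predicate, object) tuple and the two templates below are hypothetical stand-ins; MARGE's actual DeconStructure representation is considerably richer.

```python
# Toy illustration of template-based question generation from a
# sentence-level meaning representation (hypothetical format).

def generate_questions(parse):
    subject, predicate, obj = parse
    return [
        f"What does {subject} {predicate}?",
        f"Who or what {predicate}s {obj}?",
    ]

questions = generate_questions(("the heart", "pump", "blood"))
# questions[0] == "What does the heart pump?"
```

A purely syntactic system would emit every template that the grammar allows; the point of a meaning analysis is to license only the templates that fit the sentence's semantics.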
Date: December 2016
Creator: Mazidi, Karen
System: The UNT Digital Library
A New Look at Retargetable Compilers (open access)

Consumers demand new and innovative personal computing devices every 2 years when their cellular phone service contracts are renewed. Yet, a 2-year development cycle for the concurrent development of both hardware and software is nearly impossible. As more components and features are added to the devices, maintaining this 2-year cycle with current tools will become commensurately harder. This dissertation delves into the feasibility of simplifying the development of such systems by employing heterogeneous systems on a chip in conjunction with a retargetable compiler such as the hybrid computer retargetable compiler (Hy-C). An example of a simple architecture description of sufficient detail for use with a retargetable compiler like Hy-C is provided. As a software engineer with 30 years of experience, I have witnessed numerous system failures. A plethora of software development paradigms and tools have been employed to prevent software errors, but none have been completely successful. Much discussion centers on software development in the military contracting market, as that is my background. The dissertation reviews those tools, as well as some existing retargetable compilers, in an attempt to determine how those errors occurred and how a system like Hy-C could assist in reducing future software errors. In …
Date: December 2014
Creator: Burke, Patrick William
System: The UNT Digital Library
Extrapolating Subjectivity Research to Other Languages (open access)

Socrates articulated it best, "Speak, so I may see you." Indeed, language represents an invisible probe into the mind. It is the medium through which we express our deepest thoughts, our aspirations, our views, our feelings, our inner reality. From the beginning of artificial intelligence, researchers have sought to impart human-like understanding to machines. Much of our language represents a form of self-expression, capturing thoughts, beliefs, evaluations, opinions, and emotions that are not available for scrutiny by an outside observer; in the field of natural language processing, research involving these aspects has crystallized under the name of subjectivity and sentiment analysis. While subjectivity classification labels text as either subjective or objective, sentiment classification further divides subjective text into either positive, negative, or neutral. In this thesis, I investigate techniques of generating tools and resources for subjectivity analysis that do not rely on an existing natural language processing infrastructure in a given language. This constraint is motivated by the fact that the vast majority of human languages are scarce from an electronic point of view: they lack basic tools such as part-of-speech taggers, parsers, or basic resources such as electronic text, annotated corpora or lexica. This severely limits the …
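A lexicon-based classifier of the kind that can be bootstrapped without NLP infrastructure can be sketched in a few lines. The seed lexicon here is an illustrative English stand-in; in a resource-scarce setting the seeds would be projected into the target language.

```python
# Minimal sketch of seed-lexicon subjectivity classification: a
# sentence is labelled subjective if it contains any seed word.
# The lexicon is a toy example, not a resource from the thesis.

SUBJECTIVE_SEEDS = {"love", "hate", "beautiful", "terrible", "believe"}

def classify(sentence):
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    return "subjective" if words & SUBJECTIVE_SEEDS else "objective"

label = classify("I believe this city is beautiful!")
# label == "subjective"
```

Such a classifier requires no tagger, parser, or annotated corpus, which is exactly the constraint the thesis works under.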
Date: May 2013
Creator: Banea, Carmen
System: The UNT Digital Library
Modeling Synergistic Relationships Between Words and Images (open access)

Texts and images provide alternative, yet orthogonal views of the same underlying cognitive concept. By uncovering synergistic, semantic relationships that exist between words and images, I am working to develop novel techniques that can help improve tasks in natural language processing, as well as effective models for text-to-image synthesis, image retrieval, and automatic image annotation. Specifically, in my dissertation, I will explore the interoperability of features between language and vision tasks. In the first part, I will show how it is possible to apply features generated using evidence gathered from text corpora to solve the image annotation problem in computer vision, without the use of any visual information. In the second part, I will address research in the reverse direction, and show how visual cues can be used to improve tasks in natural language processing. Importantly, I propose a novel metric to estimate the similarity of words by comparing the visual similarity of concepts invoked by these words, and show that it can be used further to advance the state-of-the-art methods that employ corpus-based and knowledge-based semantic similarity measures. Finally, I attempt to construct a joint semantic space connecting words with images, and synthesize an evaluation framework to quantify cross-modal …
Date: December 2012
Creator: Leong, Chee Wee
System: The UNT Digital Library
Extracting Possessions and Their Attributes (open access)

Possession is an asymmetric semantic relation between two entities, where one entity (the possessee) belongs to the other entity (the possessor). Automatically extracting possessions is useful for identifying skills, for recommender systems, and for natural language understanding. Possessions can be found in different communication modalities including text, images, video, and audio. In this dissertation, I elaborate on the techniques I used to extract possessions. I begin with extracting possessions at the sentence level, including their type and temporal anchors. Then, I extract the duration of possession and co-possessions (when multiple possessors possess the same entity). Next, I extract possessions from an entire Wikipedia article, capturing the change of possessors over time. I also extract possessions from social media, including both text and images. Finally, I present dense annotations generating possession timelines. I present separate datasets, detailed corpus analysis, and machine learning models for each task described above.
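The sentence-level task can be illustrated with a rule-based baseline. The two patterns below ("X's Y" and "X has/owns Y") are generic illustrations only; the dissertation's extractors are learned models, not rules.

```python
# Rough rule-based sketch of sentence-level possession extraction,
# returning (possessor, possessee) pairs. Patterns are illustrative.
import re

PATTERNS = [
    re.compile(r"(\w+)'s (\w+)"),                          # "Mary's car"
    re.compile(r"(\w+) (?:has|owns) (?:a |an |the )?(\w+)"),  # "John owns a boat"
]

def extract_possessions(text):
    pairs = []
    for pattern in PATTERNS:
        for possessor, possessee in pattern.findall(text):
            pairs.append((possessor, possessee))
    return pairs

pairs = extract_possessions("Mary's car is red. John owns a boat.")
```

A learned model would additionally predict the relation's type and its temporal anchors, which simple surface patterns cannot do reliably.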
Date: May 2020
Creator: Chinnappa, Dhivya Infant
System: The UNT Digital Library
Sentence Similarity Analysis with Applications in Automatic Short Answer Grading (open access)

In this dissertation, I explore unsupervised techniques for the task of automatic short answer grading. I compare a number of knowledge-based and corpus-based measures of text similarity, evaluate the effect of domain and size on the corpus-based measures, and also introduce a novel technique to improve the performance of the system by integrating automatic feedback from the student answers. I continue to combine graph alignment features with lexical semantic similarity measures and employ machine learning techniques to show that grade assignment error can be reduced compared to a system that considers only lexical semantic measures of similarity. I also detail a preliminary attempt to align the dependency graphs of student and instructor answers in order to utilize a structural component that is necessary to simulate human-level grading of student answers. I further explore the utility of these techniques to several related tasks in natural language processing including the detection of text similarity, paraphrase, and textual entailment.
Date: August 2012
Creator: Mohler, Michael A. G.
System: The UNT Digital Library
Capacity and Throughput Optimization in Multi-cell 3G WCDMA Networks (open access)

User modeling enables the computation of the traffic density in a cellular network, which can be used to optimize the placement of base stations and radio network controllers, as well as to analyze the performance of resource management algorithms towards meeting the final goal: the calculation and maximization of network capacity and throughput for different data rate services. An analytical model is presented for approximating the user distributions in multi-cell third generation wideband code division multiple access (WCDMA) networks using 2-dimensional Gaussian distributions, by determining the means and the standard deviations of the distributions for every cell. This model allows for the calculation of the inter-cell interference and the reverse-link capacity of the network. An analytical model for optimizing capacity in multi-cell WCDMA networks is presented. Capacity is optimized for different spreading factors and for perfect and imperfect power control. Numerical results show that the SIR threshold for the received signals is decreased by 0.5 to 1.5 dB due to the imperfect power control. The results also show that the determined parameters of the 2-dimensional Gaussian model match well with traditional methods for modeling user distribution. A call admission control algorithm is designed that maximizes the throughput in multi-cell …
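The per-cell Gaussian fit can be sketched directly: estimate the mean and standard deviation of user coordinates in a cell, then evaluate an axis-aligned 2-D normal density. The positions below are toy data, and independence of the x and y components is a simplifying assumption of this sketch.

```python
# Sketch of fitting a per-cell 2-D Gaussian user model and evaluating
# its density; interference and capacity computations would build on
# densities like this.
import math
from statistics import mean, pstdev

def fit_gaussian(positions):
    xs, ys = zip(*positions)
    return (mean(xs), mean(ys)), (pstdev(xs), pstdev(ys))

def density(point, mu, sigma):
    # product of two independent 1-D normal densities
    p = 1.0
    for v, m, s in zip(point, mu, sigma):
        p *= math.exp(-((v - m) ** 2) / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))
    return p

users = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (1.0, -1.0)]
mu, sigma = fit_gaussian(users)
```

Integrating such densities over neighboring cells is what yields the inter-cell interference terms in the capacity analysis.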
Date: December 2005
Creator: Nguyen, Son
System: The UNT Digital Library
Reading with Robots: A Platform to Promote Cognitive Exercise through Identification and Discussion of Creative Metaphor in Books (open access)

Maintaining cognitive health is often a pressing concern for aging adults, and given the world's shifting age demographics, it is impractical to assume that older adults will be able to rely on individualized human support for doing so. Recently, interest has turned toward technology as an alternative. Companion robots offer an attractive vehicle for facilitating cognitive exercise, but the language technologies guiding their interactions are still nascent; in elder-focused human-robot systems proposed to date, interactions have been limited to motion or buttons and canned speech. The incapacity of these systems to autonomously participate in conversational discourse limits their ability to engage users at a cognitively meaningful level. I addressed this limitation by developing a platform for human-robot book discussions, designed to promote cognitive exercise by encouraging users to consider the authors' underlying intentions in employing creative metaphors. The choice of book discussions as the backdrop for these conversations has an empirical basis in neuro- and social science research, which has found that reading often, even in late adulthood, is correlated with a decreased likelihood of exhibiting symptoms of cognitive decline. The more targeted focus on novel metaphors within those conversations stems from prior work showing that processing novel metaphors …
Date: August 2018
Creator: Parde, Natalie
System: The UNT Digital Library
An Efficient Approach for Dengue Mitigation: A Computational Framework (open access)

Dengue mitigation is a major research area among scientists working towards effective management of the dengue epidemic. Effective dengue mitigation requires several important components: accurate epidemic modeling, efficient epidemic prediction, and efficient resource allocation for controlling the spread of the disease. Past studies assumed a homogeneous response pattern of the dengue epidemic to climate conditions throughout the regions. The dengue epidemic is both climate dependent and geographically dependent, so a global model is not sufficient to capture the local variations of the epidemic. We propose a novel method of epidemic modeling that considers local variation and uses a micro ensemble of regressors for each region. Three regressors are used in the construction of the ensemble: support vector regression, ordinary least squares regression, and k-nearest neighbor regression. The best performing regressors are selected into the ensemble. The proposed ensemble determines the risk of dengue epidemic in each region in advance. The risk is then used in risk-based resource allocation. The proposed resource allocation builds on a genetic algorithm, with major modifications to its main components, …
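The per-region selection step can be sketched as follows: train several simple regressors on a region's history and keep whichever has the lowest validation error. Only ordinary least squares and k-nearest-neighbor regression are implemented here (stdlib only, toy data); the dissertation's ensemble also includes support vector regression.

```python
# Toy sketch of per-region "micro ensemble" selection: the regressor
# with the lowest validation MSE is kept for that region.

def ols_fit(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def knn_fit(xs, ys, k=2):
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict

def best_regressor(train, valid):
    xs, ys = zip(*train)
    candidates = {"ols": ols_fit(xs, ys), "knn": knn_fit(xs, ys)}
    def mse(f):
        return sum((f(x) - y) ** 2 for x, y in valid) / len(valid)
    return min(candidates, key=lambda name: mse(candidates[name]))

# A region with a roughly linear trend: OLS should win on held-out points.
train = [(1, 2.0), (2, 4.1), (3, 5.9), (4, 8.0)]
valid = [(5, 10.0), (6, 12.1)]
winner = best_regressor(train, valid)
```

Repeating this selection per region yields a locally adapted model for each, which is the point of the micro-ensemble design.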
Date: May 2019
Creator: Dinayadura, Nirosha
System: The UNT Digital Library
Improving Communication and Collaboration Using Artificial Intelligence: An NLP-Enabled Pair Programming Collaborative-ITS Case Study (open access)

This dissertation investigates computational models and methods to improve collaboration skills among students. The study targets pair programming, a popular collaborative learning practice in computer science education. This research led to the first machine learning models capable of detecting micromanagement, exclusive language, and other types of collaborative talk during pair programming. The investigation of computational models led to a novel method for adapting pretrained language models by first training them with a multi-task learning objective. I performed computational linguistic analysis of the types of interactions commonly seen in pair programming and obtained computationally tractable features to classify collaborative talk. In addition, I evaluated a novel metric used to assess the models in this dissertation. This metric is applicable in the areas of affective systems, formative feedback systems, and the broader field of computer science. Lastly, I present a computational method, CollabAssist, for providing real-time feedback to improve collaboration. The empirical evaluation of CollabAssist demonstrated a statistically significant reduction in micromanagement during pair programming. Overall, this dissertation contributes to the development of better collaborative learning practices and facilitates greater student learning gains, thereby improving students' computer science skills.
Date: July 2023
Creator: Ubani, Solomon
System: The UNT Digital Library

Deep Learning Optimization and Acceleration

The novelty of this dissertation is the optimization and acceleration of deep neural networks aimed at real-time predictions with minimal energy consumption. It consists of cross-layer optimization, output directed dynamic quantization, and opportunistic near-data computation for deep neural network acceleration. On two datasets (CIFAR-10 and CIFAR-100), the proposed deep neural network optimization and acceleration frameworks are tested using a variety of convolutional neural networks (e.g., LeNet-5, VGG-16, GoogLeNet, DenseNet, ResNet). Experimental results are promising when compared to other state-of-the-art deep neural network acceleration efforts in the literature.
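One ingredient of such acceleration, dynamic per-tensor quantization, can be illustrated generically: scale float weights into int8 and dequantize them back. This fixed 8-bit round trip is only a sketch; the dissertation's output-directed scheme chooses precision adaptively and is not reproduced here.

```python
# Generic sketch of dynamic per-tensor int8 quantization: the scale is
# derived from the tensor's own range at run time (hence "dynamic").

def quantize(weights, bits=8):
    qmax = 2 ** (bits - 1) - 1          # 127 for int8
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize(weights)
restored = dequantize(q, scale)
```

Storing and multiplying 8-bit integers instead of 32-bit floats is what yields the memory and energy savings, at the cost of the bounded rounding error visible in the round trip.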
Date: August 2022
Creator: Jiang, Beilei
System: The UNT Digital Library
Procedural Generation of Content for Online Role Playing Games (open access)

Video game players demand a volume of content far in excess of the ability of game designers to create it. For example, a single quest might take a week to develop and test, which means that companies such as Blizzard are spending millions of dollars each month on new content for their games. As a result, both players and developers are frustrated with the inability to meet the demand for new content. By generating content on-demand, it is possible to create custom content for each player based on player preferences. It is also possible to make use of the current world state during generation, something which cannot be done with current techniques. Using developers to create rules and assets for a content generator instead of creating content directly will lower development costs as well as reduce the development time for new game content to seconds rather than days. This work is part of the field of computational creativity, and involves the use of computers to create aesthetically pleasing game content, such as terrain, characters, and quests. I demonstrate agent-based terrain generation, and economic modeling of game spaces. I also demonstrate the autonomous generation of quests for online role playing games, …
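The on-demand idea can be sketched with a minimal template-driven quest generator that draws on a world state and a player preference. The world state, templates, and preference keys below are all illustrative assumptions, not the dissertation's generator.

```python
# Minimal sketch of on-demand quest generation: fill a quest template
# from the current world state, keyed by a (hypothetical) player
# preference. A seed makes generation reproducible.
import random

TEMPLATES = {
    "combat": "Defeat the {monster} threatening {town}.",
    "fetch": "Recover the {item} lost near {town}.",
}

def generate_quest(world, preference, seed=None):
    rng = random.Random(seed)
    template = TEMPLATES[preference]
    return template.format(
        monster=rng.choice(world["monsters"]),
        item=rng.choice(world["items"]),
        town=rng.choice(world["towns"]),
    )

world = {"monsters": ["wyvern", "bandit king"],
         "items": ["silver chalice"],
         "towns": ["Northgate"]}
quest = generate_quest(world, "fetch", seed=7)
```

Because the generator reads the world state at generation time, quests can reference current conditions, which static hand-authored content cannot do.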
Date: August 2014
Creator: Doran, Jonathon
System: The UNT Digital Library
Reliability and Throughput Improvement in Vehicular Communication by Using 5G Technologies (open access)

The vehicular community is moving towards a whole new paradigm with the advancement of new technology. Vehicular communication not only supports safety services but also provides non-safety services like navigation support, toll collection, web browsing, media streaming, etc. The existing communication frameworks like Dedicated Short Range Communication (DSRC) and Cellular V2X (C-V2X) might not meet the required capacity in the coming days. So, the vehicular community needs to adopt new technologies and upgrade the existing communication frameworks so that they can fulfill the desired expectations. Therefore, an increment in reliability and data rate is required. Multiple Input Multiple Output (MIMO), 5G New Radio, Low Density Parity Check (LDPC) codes, and massive MIMO signal detection and equalization algorithms are the latest additions to the 5G wireless communication domain. These technologies have the potential to make the existing V2X communication framework more robust. As a result, more reliability and throughput can be achieved. This work demonstrates these technologies' compatibility with and positive impact on the existing V2X communication standard.
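The parity-check principle behind LDPC codes can be illustrated in miniature: a received word is a valid codeword exactly when its syndrome (the parity-check matrix times the word, mod 2) is zero. The tiny matrix below is illustrative only, not a real 5G NR LDPC matrix, and no iterative decoding is shown.

```python
# Toy illustration of parity checking for a 6-bit code: syndrome
# computation over GF(2). H is an illustrative matrix, with each row
# defining one parity constraint over the codeword bits.

H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def syndrome(word):
    return [sum(h * x for h, x in zip(row, word)) % 2 for row in H]

valid = [1, 1, 0, 0, 1, 1]   # satisfies all three parity constraints
```

A nonzero syndrome flags transmission errors; LDPC decoders then iteratively correct them, which is what makes the code attractive for reliability-critical V2X links.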
Date: December 2022
Creator: Dey, Utpal-Kumar
System: The UNT Digital Library