37 Matching Results

Accuracy and Interpretability Testing of Text Mining Methods (open access)

Extracting meaningful information from large collections of text data is problematic because of the sheer size of the database. However, automated analytic methods capable of processing such data have emerged. These methods, collectively called text mining, first began to appear in 1988. A number of additional text mining methods quickly developed in independent research silos, each based on unique mathematical algorithms. How good each of these methods is at analyzing text is unclear. Method development typically evolves from some silo-centric requirement, with the success of the method measured by a custom requirement-based metric. Results of the new method are then compared to another method that was similarly developed. The proposed research introduces an experimentally designed testing method to text mining that eliminates research silo bias and simultaneously evaluates methods from all of the major context-region text mining method families. The proposed research method follows a randomized block factorial design with two treatments consisting of three and five levels (RBF-35) with repeated measures. The contribution of the research is threefold. First, the users perceived a difference in the effectiveness of the various methods. Second, while still not clear, there are characteristics within the text collection that affect the …
Date: August 2013
Creator: Ashton, Triss A.
System: The UNT Digital Library
Application of Spectral Analysis to the Cycle Regression Algorithm (open access)

Many techniques have been developed to analyze time series. Spectral analysis and cycle regression analysis represent two such techniques. This study combines these two powerful tools to produce two new algorithms: the spectral algorithm and the one-pass algorithm. This research encompasses four objectives. The first objective is to link spectral analysis with cycle regression analysis to determine an initial estimate of the sinusoidal period. The second objective is to determine the best spectral window and truncation point combination to use with cycle regression for the initial estimate of the sinusoidal period. The third objective is to determine whether the new spectral algorithm performs better than the old T-value algorithm in estimating sinusoidal parameters. The fourth objective is to determine whether the one-pass algorithm can be used to estimate all significant harmonics simultaneously.
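As a rough illustration of the first objective, the sketch below uses a periodogram peak as an initial estimate of a sinusoidal period, before any cycle-regression fit. The synthetic series, the true period of 12, and the use of scipy.signal.periodogram are illustrative assumptions, not details from the dissertation.

```python
# Sketch: a periodogram peak as the initial period estimate for a later
# sinusoidal (cycle-regression-style) fit. Data and parameters are invented.
import numpy as np
from scipy.signal import periodogram

rng = np.random.default_rng(0)
n, true_period = 240, 12.0
t = np.arange(n)
y = 3.0 * np.sin(2 * np.pi * t / true_period) + rng.normal(0, 1, n)

freqs, power = periodogram(y, detrend="linear")
peak = freqs[1:][np.argmax(power[1:])]   # skip the zero frequency
period_hat = 1.0 / peak                  # initial estimate of the period
print(f"estimated period = {period_hat:.2f}")
```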
Date: August 1984
Creator: Shah, Vivek
System: The UNT Digital Library
Call Option Premium Dynamics (open access)

This study has a twofold purpose: to demonstrate the use of the Marquardt compromise method in estimating the unknown parameters contained in the probability call-option pricing models and to test empirically the following models: the Boness, the Black-Scholes, the Merton proportional dividend, the Ingersoll differential tax, and the Ingersoll proportional dividend and differential tax.
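The "Marquardt compromise" is the Levenberg-Marquardt blend of Gauss-Newton and steepest descent for nonlinear least squares. A minimal sketch, assuming hypothetical option premiums and the standard Black-Scholes call formula, estimates the volatility parameter with SciPy's LM solver; it is not the dissertation's estimation procedure and does not cover the other models listed.

```python
# Sketch: Levenberg-Marquardt estimation of the Black-Scholes volatility
# parameter from observed call premiums. All prices below are hypothetical.
import numpy as np
from scipy.optimize import least_squares
from scipy.stats import norm

def bs_call(S, K, T, r, sigma):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

S, r = 100.0, 0.05
K = np.array([90.0, 100.0, 110.0])
T = np.array([0.25, 0.25, 0.25])
observed = np.array([12.10, 4.60, 1.20])   # hypothetical market premiums

residuals = lambda p: bs_call(S, K, T, r, p[0]) - observed
fit = least_squares(residuals, x0=[0.3], method="lm")
print("estimated sigma:", fit.x[0])
```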
Date: December 1982
Creator: Chen, Jim
System: The UNT Digital Library
The Chi Square Approximation to the Hypergeometric Probability Distribution (open access)

This study compared the results of the chi-square test of independence and the corrected chi-square statistic against Fisher's exact probability test (the hypergeometric distribution) in connection with sampling from a finite population. Data were collected by advancing the minimum cell size from zero to a maximum which resulted in a tail area probability of 20 percent for sample sizes from 10 to 100 by varying increments. Analysis of the data supported the rejection of the null hypotheses regarding the general rule-of-thumb guidelines concerning sample size, minimum cell expected frequency and the continuity correction factor. It was discovered that the computation using Yates' correction factor resulted in values which were so overly conservative (i.e., tail area probabilities that were 20 to 50 percent higher than Fisher's exact test) that conclusions drawn from this calculation might prove to be inaccurate. Accordingly, a new correction factor was proposed which eliminated much of this discrepancy. Its performance was equally consistent with that of the uncorrected chi-square statistic and, at times, even better.
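A minimal sketch of the comparison described: the uncorrected chi-square test of independence, the Yates-corrected version, and Fisher's exact test applied to one 2x2 table. The counts are hypothetical, and the study's proposed new correction factor is not implemented here.

```python
# Sketch: uncorrected chi-square, Yates-corrected chi-square, and Fisher's
# exact test on a single illustrative 2x2 table.
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

table = np.array([[8, 2],
                  [3, 7]])

chi2_plain, p_plain, _, _ = chi2_contingency(table, correction=False)
chi2_yates, p_yates, _, _ = chi2_contingency(table, correction=True)
_, p_fisher = fisher_exact(table)

print(f"uncorrected chi-square p = {p_plain:.4f}")
print(f"Yates-corrected p        = {p_yates:.4f}")
print(f"Fisher exact p           = {p_fisher:.4f}")
```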
Date: August 1982
Creator: Anderson, Randy J. (Randy Jay)
System: The UNT Digital Library
Classification by Neural Network and Statistical Models in Tandem: Does Integration Enhance Performance? (open access)

The major purposes of the current research are twofold. The first purpose is to present a composite approach to the general classification problem by using outputs from various parametric statistical procedures and neural networks. The second purpose is to compare several parametric and neural network models on a transportation planning related classification problem and five simulated classification problems.
Date: December 1998
Creator: Mitchell, David
System: The UNT Digital Library
The Comparative Effects of Varying Cell Sizes on McNemar's Test with the χ² Test of Independence and t Test for Related Samples (open access)

This study compared the results for McNemar's test, the t test for related measures, and the chi-square test of independence as cell sizes varied in a two-by-two frequency table. In this study, the probability results for McNemar's test, the t test for related measures, and the chi-square test of independence were compared for 13,310 different combinations of cell sizes in a two-by-two design. Several conclusions were reached: With very few exceptions, the t test for related measures and McNemar's test yielded probability results within .002 of each other. The chi-square test seemed to equal the other two tests consistently only when low probabilities less than or equal to .001 were attained. It is recommended that the researcher consider using the t test for related measures as a viable option for McNemar's test except when the researcher is certain he/she is only interested in 'changes'. The chi-square test of independence not only tests a different hypothesis than McNemar's test, but it often yields greatly differing results from McNemar's test.
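The sketch below applies the three tests compared in the study to one hypothetical two-by-two table of paired yes/no responses. The cell counts are illustrative, and the table is expanded into 0/1 pairs only so the paired t test can be run on the same data.

```python
# Sketch: McNemar's test, the chi-square test of independence, and the
# paired t test on the same illustrative 2x2 table of before/after responses.
import numpy as np
from scipy.stats import ttest_rel, chi2_contingency
from statsmodels.stats.contingency_tables import mcnemar

# rows = before (yes/no), columns = after (yes/no); counts are invented
table = np.array([[20, 5],
                  [15, 10]])

print("McNemar p    =", mcnemar(table, exact=False).pvalue)
print("chi-square p =", chi2_contingency(table, correction=False)[1])

# rebuild the 0/1 coded pairs implied by the table for the paired t test
before = np.repeat([1, 1, 0, 0], table.flatten())
after  = np.repeat([1, 0, 1, 0], table.flatten())
print("paired t p   =", ttest_rel(before, after).pvalue)
```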
Date: August 1980
Creator: Black, Kenneth U.
System: The UNT Digital Library
Comparing Latent Dirichlet Allocation and Latent Semantic Analysis as Classifiers (open access)

In the Information Age, a proliferation of unstructured electronic text documents exists. Processing these documents by humans is a daunting task, as humans have limited cognitive abilities for processing large volumes of documents that can often be extremely lengthy. To address this problem, text data computer algorithms are being developed. Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) are two text data computer algorithms that have received much attention individually in the text data literature for topic extraction studies, but not for document classification or for comparison studies. Since classification is considered an important human function and has been studied in the areas of cognitive science and information science, in this dissertation a research study was performed to compare LDA, LSA and humans as document classifiers. The research questions posed in this study are: R1: How accurate are LDA and LSA in classifying documents in a corpus of textual data over a known set of topics? R2: How accurate are humans in performing the same classification task? R3: How does LDA classification performance compare to LSA classification performance? To address these questions, a classification study involving human subjects was designed where humans were asked to generate and classify documents …
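A minimal sketch of using LDA and LSA as classifiers in the sense described above: each document is assigned to its dominant topic (LDA) or component (LSA). The toy corpus, the choice of two dimensions, and the scikit-learn implementations are assumptions for illustration only.

```python
# Sketch: LDA topic weights and LSA (truncated SVD) loadings used as
# one-step classifiers by taking each document's dominant dimension.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation, TruncatedSVD

docs = [
    "stocks bonds market prices investors",
    "market trading stocks portfolio returns",
    "patients hospital treatment clinical doctors",
    "clinical trial patients drug treatment",
]

# LDA over raw term counts
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda_labels = lda.fit_transform(counts).argmax(axis=1)

# LSA over tf-idf weights
tfidf = TfidfVectorizer().fit_transform(docs)
lsa = TruncatedSVD(n_components=2, random_state=0)
lsa_labels = abs(lsa.fit_transform(tfidf)).argmax(axis=1)

print("LDA labels:", lda_labels)
print("LSA labels:", lsa_labels)
```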
Date: December 2011
Creator: Anaya, Leticia H.
System: The UNT Digital Library
Comparing the Powers of Several Proposed Tests for Testing the Equality of the Means of Two Populations When Some Data Are Missing (open access)

In comparing the means of two normally distributed populations with unknown variance, two tests very often used are the two-independent-sample and the paired-sample t tests. There is a possible gain in the power of the significance test by using the paired sample design instead of the two independent samples design.
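A small simulation, with arbitrary effect size and correlation, illustrates the power gain referred to above: when the two measurements are positively correlated, the paired-sample t test rejects more often than the two-independent-sample t test.

```python
# Sketch: estimated power of the independent-samples and paired-samples
# t tests on correlated data. Effect size, correlation, and n are invented.
import numpy as np
from scipy.stats import ttest_ind, ttest_rel

rng = np.random.default_rng(1)
n, shift, rho, reps = 20, 0.5, 0.7, 2000
cov = [[1.0, rho], [rho, 1.0]]

hits_ind = hits_rel = 0
for _ in range(reps):
    x, y = rng.multivariate_normal([0.0, shift], cov, size=n).T
    hits_ind += ttest_ind(x, y).pvalue < 0.05
    hits_rel += ttest_rel(x, y).pvalue < 0.05

print(f"power, independent-samples t: {hits_ind / reps:.2f}")
print(f"power, paired-samples t:      {hits_rel / reps:.2f}")
```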
Date: May 1994
Creator: Dunu, Emeka Samuel
System: The UNT Digital Library
Derivation of Probability Density Functions for the Relative Differences in the Standard and Poor's 100 Stock Index Over Various Intervals of Time (open access)

In this study a two-part mixed probability density function was derived which described the relative changes in the Standard and Poor's 100 Stock Index over various intervals of time. The density function is a mixture of two different halves of normal distributions. Optimal values for the standard deviations for the two halves and the mean are given. Also, a general form of the function is given which uses linear regression models to estimate the standard deviations and the means. The density functions allow stock market participants trading index options and futures contracts on the S & P 100 Stock Index to determine probabilities of success or failure of trades involving price movements of certain magnitudes in given lengths of time.
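For reference, a generic two-piece ("split") normal density of the kind described above, with standard deviation σ₁ below the mean μ and σ₂ above it, joined continuously at μ. The optimal parameter values reported in the study are not reproduced here.

```latex
% Generic two-piece normal density; the study's fitted values for
% \mu, \sigma_1, \sigma_2 are not shown.
f(x) =
\begin{cases}
\dfrac{\sqrt{2/\pi}}{\sigma_1 + \sigma_2}\,
  \exp\!\left(-\dfrac{(x-\mu)^2}{2\sigma_1^2}\right), & x < \mu,\\[2ex]
\dfrac{\sqrt{2/\pi}}{\sigma_1 + \sigma_2}\,
  \exp\!\left(-\dfrac{(x-\mu)^2}{2\sigma_2^2}\right), & x \ge \mu.
\end{cases}
```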
Date: August 1988
Creator: Bunger, R. C. (Robert Charles)
System: The UNT Digital Library
Developing Criteria for Extracting Principal Components and Assessing Multiple Significance Tests in Knowledge Discovery Applications (open access)

With advances in computer technology, organizations are able to store large amounts of data in data warehouses. There are two fundamental issues researchers must address: the dimensionality of data and the interpretation of multiple statistical tests. The first issue addressed by this research is the determination of the number of components to retain in principal components analysis. This research establishes regression, asymptotic theory, and neural network approaches for estimating mean and 95th percentile eigenvalues for implementing Horn's parallel analysis procedure for retaining components. Certain methods perform better for specific combinations of sample size and numbers of variables. The adjusted normal order statistic estimator (ANOSE), an asymptotic procedure, performs the best overall. Future research is warranted on combining methods to increase accuracy. The second issue involves interpreting multiple statistical tests. This study uses simulation to show that Parker and Rothenberg's technique using a density function with a mixture of betas to model p-values is viable for p-values from central and non-central t distributions. The simulation study shows that final estimates obtained in the proposed mixture approach reliably estimate the true proportion of the distributions associated with the null and nonnull hypotheses. Modeling the density of p-values allows for better control of …
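A minimal sketch of Horn's parallel analysis, the retention procedure the estimators above feed into: components are kept when their observed eigenvalues exceed the mean (or 95th percentile) eigenvalues of random, uncorrelated data of the same dimensions. The simulated two-factor data set is an illustrative assumption; the regression, asymptotic (ANOSE), and neural network estimators themselves are not implemented here.

```python
# Sketch of Horn's parallel analysis on simulated data with two factors.
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_vars, n_sims = 200, 10, 500

def corr_eigenvalues(x):
    return np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]

# observed data with a two-factor structure plus noise
factors = rng.normal(size=(n_obs, 2))
data = factors @ rng.normal(size=(2, n_vars)) + rng.normal(size=(n_obs, n_vars))
observed = corr_eigenvalues(data)

# eigenvalues of purely random data of the same size
random_eigs = np.array([corr_eigenvalues(rng.normal(size=(n_obs, n_vars)))
                        for _ in range(n_sims)])

print("retain (mean rule):", int((observed > random_eigs.mean(axis=0)).sum()))
print("retain (95th pct): ", int((observed > np.percentile(random_eigs, 95, axis=0)).sum()))
```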
Date: August 1999
Creator: Keeling, Kellie Bliss
System: The UNT Digital Library
The Development and Evaluation of a Forecasting System that Incorporates ARIMA Modeling with Autoregression and Exponential Smoothing (open access)

This research was designed to develop and evaluate an automated alternative to the Box-Jenkins method of forecasting. The study involved two major phases. The first phase was the formulation of an automated ARIMA method; the second was the combination of forecasts from the automated ARIMA with forecasts from two other automated methods, the Holt-Winters method and the Stepwise Autoregressive method. The development of the automated ARIMA, based on a decision criterion suggested by Akaike, borrows heavily from the work of Ang, Chuaa and Fatema. Seasonality and small data set handling were some of the modifications made to the original method to make it suitable for use with a broad range of time series. Forecasts were combined by means of both the simple average and a weighted averaging scheme. Empirical and generated data were employed to perform the forecasting evaluation. The 111 sets of empirical data came from the M-Competition. The twenty-one sets of generated data arose from ARIMA models that Box, Tiao and Pack analyzed using the Box-Jenkins method. To compare the forecasting abilities of the Box-Jenkins and the automated ARIMA alone and in combination with the other two methods, two accuracy measures were used. These measures, which are free …
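A rough sketch of the combination idea: an ARIMA order chosen by minimum AIC (Akaike's criterion, as in the automated ARIMA), plus Holt-Winters smoothing and a plain autoregression standing in for the Stepwise Autoregressive method, combined by a simple average. The synthetic series, the small order grid, and the statsmodels implementations are assumptions for illustration only.

```python
# Sketch: AIC-selected ARIMA, Holt-Winters, and an AR model combined
# by a simple average of their forecasts.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(3)
y = np.cumsum(rng.normal(0.2, 1.0, 120))   # synthetic trending series
h = 6                                      # forecast horizon

# "automated ARIMA": smallest AIC over a small order grid
orders = [(p, d, q) for p in range(3) for d in range(2) for q in range(3)]
best = min(orders, key=lambda o: ARIMA(y, order=o).fit().aic)
f_arima = np.asarray(ARIMA(y, order=best).fit().forecast(h))

f_hw = np.asarray(ExponentialSmoothing(y, trend="add").fit().forecast(h))
f_ar = np.asarray(AutoReg(y, lags=4).fit().predict(start=len(y), end=len(y) + h - 1))

combined = (f_arima + f_hw + f_ar) / 3     # simple average combination
print("selected ARIMA order:", best)
print("combined forecast:", np.round(combined, 2))
```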
Date: May 1985
Creator: Simmons, Laurette Poulos
System: The UNT Digital Library
Economic Statistical Design of Inverse Gaussian Distribution Control Charts (open access)

Statistical quality control (SQC) is one technique companies are using in the development of a Total Quality Management (TQM) culture. Shewhart control charts, a widely used SQC tool, rely on an underlying normal distribution of the data. Often data are skewed. The inverse Gaussian distribution is a probability distribution that is well-suited to handling skewed data. This analysis develops models and a set of tools usable by practitioners for the constrained economic statistical design of control charts for inverse Gaussian distribution process centrality and process dispersion. The use of this methodology is illustrated by the design of an x-bar chart and a V chart for an inverse Gaussian distributed process.
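A minimal sketch in the spirit of the charts described: probability-based control limits for a skewed process obtained from a fitted inverse Gaussian distribution. The economic statistical design itself, with its cost model and constraints, is not implemented, and the data and percentile choices are illustrative assumptions.

```python
# Sketch: fit an inverse Gaussian distribution to skewed process data and
# take 3-sigma-equivalent probability limits (0.135% / 99.865%) as chart limits.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
data = stats.invgauss.rvs(0.5, scale=10.0, size=300, random_state=rng)  # skewed data

shape, loc, scale = stats.invgauss.fit(data, floc=0)
lcl, center, ucl = stats.invgauss.ppf([0.00135, 0.5, 0.99865],
                                      shape, loc=loc, scale=scale)
print(f"LCL = {lcl:.2f}, center = {center:.2f}, UCL = {ucl:.2f}")
```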
Date: August 1990
Creator: Grayson, James M. (James Morris)
System: The UNT Digital Library
The Effect of Certain Modifications to Mathematical Programming Models for the Two-Group Classification Problem (open access)

This research examines certain modifications of the mathematical programming models to improve their classificatory performance. These modifications involve the inclusion of second-order terms and secondary goals in mathematical programming models. A Monte Carlo simulation study is conducted to investigate the performance of two standard parametric models and various mathematical programming models, including the MSD (minimize sum of deviations) model, the MIP (mixed integer programming) model and the hybrid linear programming model.
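A minimal sketch of the MSD (minimize sum of deviations) formulation solved as a linear program: group 1 scores are pushed below a cutoff and group 2 scores above it, with nonnegative deviation variables absorbing violations. The fixed separation of 1 on each side of the cutoff is one common way to rule out the trivial zero solution; the data are simulated, and the MIP, hybrid, and second-order variants are not shown.

```python
# Sketch: MSD two-group classification as a linear program.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)
g1 = rng.normal([0, 0], 1.0, size=(30, 2))   # group 1 observations
g2 = rng.normal([2, 2], 1.0, size=(30, 2))   # group 2 observations
X = np.vstack([g1, g2])
n, k = X.shape
n1 = len(g1)

# variables: [w_1..w_k, c, d_1..d_n]; minimize the sum of deviations d
cost = np.concatenate([np.zeros(k + 1), np.ones(n)])

A = np.zeros((n, k + 1 + n))
b = -np.ones(n)
A[:n1, :k] = g1          #  w.x_i - c - d_i <= -1   (group 1 below the cutoff)
A[:n1, k] = -1.0
A[n1:, :k] = -g2         # -w.x_j + c - d_j <= -1   (group 2 above the cutoff)
A[n1:, k] = 1.0
A[np.arange(n), k + 1 + np.arange(n)] = -1.0

bounds = [(None, None)] * (k + 1) + [(0, None)] * n
res = linprog(cost, A_ub=A, b_ub=b, bounds=bounds)
w, c = res.x[:k], res.x[k]
labels = np.r_[np.zeros(n1), np.ones(n - n1)]
print("weights:", np.round(w, 3), "cutoff:", round(c, 3))
print("misclassified:", int(((X @ w > c) != labels).sum()))
```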
Date: May 1994
Creator: Wanarat, Pradit
System: The UNT Digital Library
The Effect of Value Co-creation and Service Quality on Customer Satisfaction and Commitment in Healthcare Management (open access)

Despite much interest in service quality and various other service quality measures, scholars appear to have overlooked the overall concept of quality. More specifically, previous research has yet to integrate the effect of the customer network and customer knowledge into the measurement of quality. In this work, it is posited that the evaluation of quality is based on both the delivered value from the provider as well as the value developed from the relationships among customers and between customers and providers. This research examines quality as a broad and complex issue, and uses the “Big Quality” concept within the context of routine healthcare service. The last few decades have witnessed interest and activities surrounding the subject of quality and value co-creation. These are core features of Service-Dominant (S-D) logic theory. In this theory, the customer is a collaborative partner who co-creates value with the firm. Customers create value through the strength of their relations and network, and they take a central role in value actualization as value co-creator. I propose to examine the relationship between quality and the constructs of value co-creation. As well, due to the pivotal role of the decision-making process in customer satisfaction, I will also operationalize …
Date: August 2015
Creator: Kwon, Junhyuk
System: The UNT Digital Library
The Establishment of Helicopter Subsystem Design-to-Cost Estimates by Use of Parametric Cost Estimating Models (open access)

The purpose of this research was to develop parametric Design-to-Cost models for selected major subsystems of certain helicopters. This was accomplished by analyzing the relationships between historical production costs and certain design parameters which are available during the preliminary design phase of the life cycle. Several potential contributions are identified in the areas of academia, government, and industry. Application of the cost models will provide estimates beneficial to the government and DoD by allowing derivation of realistic Design-to-Cost estimates. In addition, companies in the helicopter industry will benefit by using the models for two key purposes: (1) optimizing helicopter design through cost-effective tradeoffs, and (2) justifying a proposal estimate.
Date: August 1979
Creator: Gilliland, Johnny J.
System: The UNT Digital Library
The Evaluation and Control of the Changes in Basic Statistics Encountered in Grouped Data (open access)

This dissertation describes the effect that the construction of frequency tables has on basic statistics computed from those frequency tables. It is directly applicable only to normally distributed data summarized by Sturges' Rule. The purpose of this research was to identify factors tending to bias sample statistics when data are summarized, and thus to allow researchers to avoid such bias. The methodology employed was a large scale simulation where 1000 replications of samples of size n = 2^(k-1) for k = 2 to 12 were drawn from a normally distributed population with a mean of zero and a standard deviation of one. A FORTRAN IV source listing is included. The report concludes that researchers should avoid the use of statistics computed from frequency tables in cases where raw data are available. Where the use of such statistics is unavoidable, the researchers can eliminate their bias by the use of empirical correction factors provided in the paper. Further research is suggested to determine the effect of summarization of data drawn from various non-normal distributions.
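A small sketch of the effect being measured: normal data are grouped into Sturges'-rule classes and the mean and standard deviation computed from the frequency table are compared with the raw-data values. The sample size and seed are arbitrary, and no correction factors are applied.

```python
# Sketch: grouped-data statistics from Sturges'-rule classes vs. raw statistics.
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(0.0, 1.0, 512)

k = int(np.ceil(1 + np.log2(len(x))))          # Sturges' rule: 1 + log2(n) classes
counts, edges = np.histogram(x, bins=k)
midpoints = (edges[:-1] + edges[1:]) / 2

grouped_mean = np.average(midpoints, weights=counts)
grouped_std = np.sqrt(np.average((midpoints - grouped_mean) ** 2, weights=counts))

print(f"raw mean {x.mean():+.4f}   grouped mean {grouped_mean:+.4f}")
print(f"raw std  {x.std():.4f}     grouped std  {grouped_std:.4f}")
```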
Date: May 1979
Creator: Scott, James P.
System: The UNT Digital Library
Financial Leverage and the Cost of Capital (open access)

The objective of the research reported in this dissertation is to conduct an empirical test of the hypothesis that, excluding income tax effects, the cost of capital to a firm is independent of the degree of financial leverage employed by the firm. This hypothesis, set forth by Franco Modigliani and Merton Miller in 1958, represents a challenge to the traditional view on the subject, a challenge which carries implications of considerable importance in the field of finance. The challenge has led to a lengthy controversy which can ultimately be resolved only by subjecting the hypothesis to empirical test. The basis of the test was Modigliani and Miller's Proposition II, a corollary of their fundamental hypothesis. Proposition II, in effect, states that equity investors fully discount any increase in risk due to financial leverage so that there is no possibility for the firm to reduce its cost of capital by employing financial leverage. The results of the research reported in this dissertation do not support that contention. The study indicates that, if equity investors require any increase in premium for increasing financial leverage, the premium required is significantly less than that predicted by the Modigliani-Miller Proposition II, over the range of …
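For reference, Modigliani and Miller's Proposition II without taxes, the corollary tested above: with r₀ the cost of capital of the unlevered firm, r_D the cost of debt, and D/E the market-value debt-equity ratio, the required return on equity rises just enough with leverage to keep the overall cost of capital unchanged.

```latex
% Modigliani-Miller Proposition II (no taxes):
% r_E = required return on equity, r_0 = unlevered cost of capital,
% r_D = cost of debt, D/E = debt-equity ratio.
r_E \;=\; r_0 \;+\; \left(r_0 - r_D\right)\frac{D}{E}
```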
Date: December 1977
Creator: Brust, Melvin F.
System: The UNT Digital Library
The Fixed v. Variable Sampling Interval Shewhart X-Bar Control Chart in the Presence of Positively Autocorrelated Data (open access)

This study uses simulation to examine differences between fixed sampling interval (FSI) and variable sampling interval (VSI) Shewhart X-bar control charts for processes that produce positively autocorrelated data. The influence of sample size (1 and 5), autocorrelation parameter, shift in process mean, and length of time between samples is investigated by comparing average time (ATS) and average number of samples (ANSS) to produce an out of control signal for FSI and VSI Shewhart X-bar charts. These comparisons are conducted in two ways: control chart limits pre-set at ±3σ_x / √n and limits computed from the sampling process. Proper interpretation of the Shewhart X-bar chart requires the assumption that observations are statistically independent; however, process data are often autocorrelated over time. Results of this study indicate that increasing the time between samples decreases the effect of positive autocorrelation between samples. Thus, with sufficient time between samples the assumption of independence is essentially not violated. Samples of size 5 produce a faster signal than samples of size 1 with both the FSI and VSI Shewhart X-bar chart when positive autocorrelation is present. However, samples of size 5 require the same time when the data are independent, indicating that this effect is a …
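A stripped-down sketch of one cell of the kind of simulation described: the average time to signal (ATS) of a fixed-sampling-interval Shewhart X-bar chart with pre-set ±3σ_x/√n limits when the individual observations follow an AR(1) process with a shifted mean. The autocorrelation, shift size, subgroup size, and run lengths are illustrative assumptions, and the variable-sampling-interval side of the comparison is not implemented.

```python
# Sketch: estimated ATS of an FSI X-bar chart on shifted AR(1) data.
import numpy as np

rng = np.random.default_rng(6)
phi, shift, n_sub, reps = 0.7, 1.0, 5, 500
sigma_x = 1.0 / np.sqrt(1 - phi**2)        # stationary sd of the AR(1) observations
limit = 3 * sigma_x / np.sqrt(n_sub)       # pre-set +/- 3*sigma_x/sqrt(n) limits

def ar1(length, mean):
    x = np.empty(length)
    x[0] = mean + rng.normal(0.0, sigma_x)
    for t in range(1, length):
        x[t] = mean + phi * (x[t - 1] - mean) + rng.normal()
    return x

times = []
for _ in range(reps):
    subgroup_means = ar1(1000, shift).reshape(-1, n_sub).mean(axis=1)
    out = np.flatnonzero(np.abs(subgroup_means) > limit)
    times.append(out[0] + 1 if out.size else len(subgroup_means))
print("estimated ATS (in sampling intervals):", np.mean(times))
```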
Date: May 1993
Creator: Harvey, Martha M. (Martha Mattern)
System: The UNT Digital Library
A Goal Programming Safety and Health Standards Compliance Model (open access)

The purpose of this dissertation was to create a safety compliance model which would advance the state of the art of safety compliance models and provide management with a practical tool which can be used in making safety decisions in an environment where multiple objectives exist. A goal programming safety compliance model (OSHA Model) was developed to fulfill this purpose. The objective function of the OSHA Model was designed to minimize the total deviation from the established goals of the model. These model goals were expressed in terms of 1) level of compliance to OSHA safety and health regulations, 2) company accident frequency rate, 3) company accident cost per worker, and 4) a company budgetary restriction. This particular set of goals was selected to facilitate management's fulfillment of its responsibilities to OSHA, the employees, and to ownership. This study concludes that all the research objectives have been accomplished. The OSHA Model formulated not only advances the state of the art of safety compliance models, but also provides a practical tool which facilitates management's safety and health decisions. The insight into the relationships existing in a safety compliance decision system provided by the OSHA Model and its accompanying sensitivity analysis was …
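A minimal sketch of a goal programming formulation in the spirit described: decision variables are spending on two hypothetical safety programs, deviation variables measure shortfall from a compliance goal, excess over an accident-frequency goal, and overspend against a budget goal, and the objective minimizes the total undesirable deviation. All coefficients are invented for illustration; this is not the OSHA Model itself.

```python
# Sketch: a small goal program solved as a linear program.
import numpy as np
from scipy.optimize import linprog

# variables: [x1, x2, dC-, dC+, dA-, dA+, dB-, dB+]
# x1, x2 = spending on two hypothetical programs; d*- / d*+ = under/over-achievement
cost = np.array([0, 0, 1, 0, 0, 1, 0, 1])   # penalize only undesirable deviations

A_eq = np.array([
    [0.80,  0.50, 1, -1, 0,  0, 0,  0],     # compliance score, goal >= 90
    [-0.05, -0.08, 0,  0, 1, -1, 0,  0],     # accident change, goal: 10 - ... <= 4
    [1.00,  1.00, 0,  0, 0,  0, 1, -1],     # total spending, goal <= 100
])
b_eq = np.array([90.0, -6.0, 100.0])

res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 8)
x1, x2 = res.x[:2]
print(f"spending: program 1 = {x1:.1f}, program 2 = {x2:.1f}")
print("unmet compliance, excess accidents, overspend:", np.round(res.x[[2, 5, 7]], 2))
```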
Date: August 1976
Creator: Ryan, Lanny J.
System: The UNT Digital Library
A Heuristic Procedure for Specifying Parameters in Neural Network Models for Shewhart X-bar Control Chart Applications (open access)

This study develops a heuristic procedure for specifying parameters for a neural network configuration (learning rate, momentum, and the number of neurons in a single hidden layer) in Shewhart X-bar control chart applications. Also, this study examines the replicability of the neural network solution when the neural network is retrained several times with different initial weights.
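The sketch below simply shows the three parameters the heuristic targets (learning rate, momentum, and the number of neurons in a single hidden layer) set explicitly on a small network trained to flag shifted subgroup means. The data generation, the scikit-learn MLPClassifier, and the parameter values are illustrative assumptions rather than the study's heuristic.

```python
# Sketch: an MLP with an explicit learning rate, momentum, and single
# hidden layer, trained to separate in-control from shifted subgroups.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(7)
in_control = rng.normal(0.0, 1.0, size=(500, 5))   # subgroups of size 5
shifted = rng.normal(1.5, 1.0, size=(500, 5))
X = np.vstack([in_control, shifted])
y = np.r_[np.zeros(500), np.ones(500)]

net = MLPClassifier(hidden_layer_sizes=(8,),   # neurons in the single hidden layer
                    learning_rate_init=0.05,   # learning rate
                    momentum=0.9,              # momentum (used with the sgd solver)
                    solver="sgd", max_iter=500, random_state=0)
net.fit(X, y)
print("training accuracy:", net.score(X, y))
```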
Date: December 1993
Creator: Nam, Kyungdoo T.
System: The UNT Digital Library
The Impact of Culture on the Decision Making Process in Restaurants (open access)

Understanding the process of consumers during key purchasing decision points is the margin between success and failure for any business. The cultural differences between the factors that affect consumers in their decision-making process are the motivation of this research. The purpose of this research is to extend the current body of knowledge about decision-making factors by developing and testing a new theoretical model to measure how culture may affect the attitudes and behaviors of consumers in restaurants. This study has its theoretical foundation in the theory of service quality, theory of planned behavior, and rational choice theory. To understand how culture affects the decision-making process and perceived satisfaction, it is necessary to analyze the relationships among the decision factors and attitudes. The findings of this study contribute by building theory and having practical implications for restaurant owners and managers. This study employs a mixed methodology of qualitative and quantitative research. More specifically, the methodologies employed include the development of a framework and testing of that framework via collection of data using semi-structured interviews and a survey instrument. Considering this framework, we test culture as a moderating relationship by using respondents’ birth country, parents’ birth country and ethnic identity. The results …
Date: August 2015
Creator: Boonme, Kittipong
System: The UNT Digital Library
Impact of Forecasting Method Selection and Information Sharing on Supply Chain Performance. (open access)

Effective supply chain management gains much attention from industry and academia because it helps firms across a supply chain to reduce cost and improve customer service level efficiently. Focusing on one of the key challenges of the supply chains, namely, demand uncertainty, this dissertation extends the work of Zhao, Xie, and Leung so as to examine the effects of forecasting method selection coupled with information sharing on supply chain performance in a dynamic business environment. The results of this study showed that under various scenarios, advanced forecasting methods such as neural network and GARCH models play a more significant role when capacity tightness increases and is more important to the retailers than to the supplier under certain circumstances in terms of supply chain costs. Thus, advanced forecasting models should be promoted in supply chain management. However, this study also demonstrated that forecasting methods not capable of modeling features of certain demand patterns significantly impact a supply chain's performance. That is, a forecasting method misspecified for characteristics of the demand pattern usually results in higher supply chain costs. Thus, in practice, supply chain managers should be cognizant of the cost impact of selecting commonly used traditional forecasting methods, such as moving …
Date: December 2009
Creator: Pan, Youqin
System: The UNT Digital Library
The Impact of Quality on Customer Behavioral Intentions Based on the Consumer Decision Making Process As Applied in E-commerce (open access)

Perceived quality in the context of e-commerce was defined and examined in numerous studies, but, to date, there are no consistent definitions and measurement scales. Instruments that measure quality in e-commerce industries primarily focus on website quality or service quality during the transaction and delivery phases. Even though some scholars have proposed instruments from different perspectives, these scales do not fully evaluate the level of quality perceived by customers during the entire decision-making process. This dissertation purports to provide five main contributions for the e-commerce, service quality, and decision science literature: (1) development of a comprehensive instrument to measure how online customers perceive the quality of the shopping channel, website, transaction and recovery based on the customer decision making process; (2) identification of the determinants of customer satisfaction and the key dimensions of customer behavioral intentions in e-commerce; (3) examination of the relationships among perceived quality, customer satisfaction and loyalty intention using empirical data; (4) application of different statistical packages (LISREL and PLS-Graph) for data analysis and comparison of how these methods impact the results; and (5) examination of the moderating effects of control variables. A survey was designed and distributed to a total of 1126 college students in a …
Date: August 2012
Creator: Wen, Chao
System: The UNT Digital Library
The Impact of Water Pollution Abatement Costs on Financing of Municipal Services in North Central Texas (open access)

The purpose of this study is to determine the effects of water pollution control on financing municipal water pollution control facilities in selected cities in North Central Texas. This objective is accomplished by addressing the following topics: (1) the cost to municipalities of meeting federally mandated water pollution control, (2) the sources of funds for financing sewage treatment, and (3) the financial implications of employing these financing tools to satisfy water quality regulations. The study makes the following conclusions regarding the impact of water pollution control costs on municipalities in the North Central Texas Region: 1) The financing of the wastewater treatment requirements of the Water Pollution Control Act Amendments of 1972 will cause many municipalities to report operating deficits for their Water and Sewer Fund. 2) A federal grant program funded at the rate of 75 per cent of waste treatment needs will prevent operating deficits in the majority of cities in which 1990 waste treatment needs constitute 20 per cent or more of the expected Water and Sewer Fund capital structure. 3) A federal grant program funded at the average rate of 35 per cent of needs will benefit only a small number of cities. 4) The federal …
Date: May 1976
Creator: Rucks, Andrew C.
System: The UNT Digital Library