37 Matching Results

Comparing Latent Dirichlet Allocation and Latent Semantic Analysis as Classifiers (open access)

In the Information Age, a proliferation of unstructured electronic text documents exists. Processing these documents is a daunting task for humans, who have limited cognitive abilities for handling large volumes of documents that can often be extremely lengthy. To address this problem, computer algorithms for analyzing text data are being developed. Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) are two such algorithms that have received much attention individually in the text data literature for topic extraction studies, but not for document classification or for comparison studies. Since classification is considered an important human function and has been studied in the areas of cognitive science and information science, in this dissertation a research study was performed to compare LDA, LSA, and humans as document classifiers. The research questions posed in this study are: R1: How accurate are LDA and LSA in classifying documents in a corpus of textual data over a known set of topics? R2: How accurate are humans in performing the same classification task? R3: How does LDA classification performance compare to LSA classification performance? To address these questions, a classification study involving human subjects was designed where humans were asked to generate and classify documents …
Date: December 2011
Creator: Anaya, Leticia H.
System: The UNT Digital Library
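
The abstract above names LDA and LSA as document classifiers. As a hedged illustration only (not the dissertation's actual procedure or corpus), the sketch below uses scikit-learn's LatentDirichletAllocation and TruncatedSVD on the 20 Newsgroups data and assigns each document to its highest-loading topic; matching topics to known categories and scoring accuracy are left out.

```python
# Minimal sketch (not the dissertation's exact procedure): fit LDA and LSA
# topic representations and use the dominant topic of each document as a
# crude class assignment over a corpus with known categories.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation, TruncatedSVD

corpus = fetch_20newsgroups(subset="train",
                            categories=["sci.space", "rec.autos", "talk.politics.guns"])

# LDA works on raw term counts; LSA (truncated SVD) is usually applied to tf-idf.
counts = CountVectorizer(stop_words="english", max_features=5000).fit_transform(corpus.data)
tfidf = TfidfVectorizer(stop_words="english", max_features=5000).fit_transform(corpus.data)

lda_topics = LatentDirichletAllocation(n_components=3, random_state=0).fit_transform(counts)
lsa_topics = TruncatedSVD(n_components=3, random_state=0).fit_transform(tfidf)

# Assign each document to its highest-loading topic; agreement with the known
# labels would then be scored after matching topics to categories.
lda_assignment = lda_topics.argmax(axis=1)
lsa_assignment = lsa_topics.argmax(axis=1)
print(lda_assignment[:10], lsa_assignment[:10])
```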
The Chi Square Approximation to the Hypergeometric Probability Distribution (open access)

This study compared the results of the chi square test of independence and the corrected chi square statistic against Fisher's exact probability test (the hypergeometric distribution) in connection with sampling from a finite population. Data were collected by advancing the minimum cell size from zero to a maximum which resulted in a tail area probability of 20 percent for sample sizes from 10 to 100 by varying increments. Analysis of the data supported the rejection of the null hypotheses regarding the general rule-of-thumb guidelines concerning sample size, minimum cell expected frequency, and the continuity correction factor. It was discovered that the computation using Yates' correction factor resulted in values which were so overly conservative (i.e., tail area probabilities that were 20 to 50 percent higher than Fisher's exact test) that conclusions drawn from this calculation might prove to be inaccurate. Accordingly, a new correction factor was proposed which eliminated much of this discrepancy. Its performance was equally consistent with that of the uncorrected chi square statistic and, at times, even better.
Date: August 1982
Creator: Anderson, Randy J. (Randy Jay)
System: The UNT Digital Library
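
For readers unfamiliar with the tests being compared, here is a minimal Python sketch contrasting the corrected and uncorrected chi-square test of independence with Fisher's exact test on a single 2x2 table using scipy; the cell counts are illustrative, not the dissertation's data.

```python
# Hedged illustration of the comparison described above: chi-square with and
# without Yates' correction versus Fisher's exact test on a made-up 2x2 table.
from scipy.stats import chi2_contingency, fisher_exact

table = [[8, 2],
         [3, 7]]

chi2_corr, p_corr, _, _ = chi2_contingency(table, correction=True)   # Yates' correction
chi2_raw, p_raw, _, _ = chi2_contingency(table, correction=False)    # uncorrected
_, p_exact = fisher_exact(table)                                     # hypergeometric (exact)

print(f"Yates-corrected chi-square p = {p_corr:.4f}")
print(f"Uncorrected chi-square p    = {p_raw:.4f}")
print(f"Fisher's exact p            = {p_exact:.4f}")
```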
Accuracy and Interpretability Testing of Text Mining Methods (open access)

Extracting meaningful information from large collections of text data is problematic because of the sheer size of the database. However, automated analytic methods capable of processing such data have emerged. These methods, collectively called text mining, first began to appear in 1988. A number of additional text mining methods quickly developed in independent research silos, each based on unique mathematical algorithms. How good each of these methods is at analyzing text is unclear. Method development typically evolves from some research-silo-centric requirement, with the success of the method measured by a custom requirement-based metric. Results of the new method are then compared to another method that was similarly developed. The proposed research introduces an experimentally designed testing method to text mining that eliminates research silo bias and simultaneously evaluates methods from all of the major context-region text mining method families. The proposed research method follows a random block factorial design with two treatments consisting of three and five levels (RBF-35) with repeated measures. The contribution of the research is threefold. First, the users perceived a difference in the effectiveness of the various methods. Second, while still not clear, there are characteristics within the text collection that affect the …
Date: August 2013
Creator: Ashton, Triss A.
System: The UNT Digital Library
The Comparative Effects of Varying Cell Sizes on McNemar's Test with the χ² Test of Independence and T Test for Related Samples (open access)

This study compared the results for McNemar's test, the t test for related measures, and the chi-square test of independence as cell sizes varied in a two-by-two frequency table. In this study, the probability results for McNemar's test, the t test for related measures, and the chi-square test of independence were compared for 13,310 different combinations of cell sizes in a two-by-two design. Several conclusions were reached: With very few exceptions, the t test for related measures and McNemar's test yielded probability results within .002 of each other. The chi-square test seemed to equal the other two tests consistently only when low probabilities less than or equal to .001 were attained. It is recommended that the researcher consider using the t test for related measures as a viable option for McNemar's test except when the researcher is certain he/she is only interested in 'changes'. The chi-square test of independence not only tests a different hypothesis than McNemar's test, but it often yields greatly differing results from McNemar's test.
Date: August 1980
Creator: Black, Kenneth U.
System: The UNT Digital Library
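
A brief sketch of the three procedures named above, applied to one hypothetical 2x2 table of paired binary outcomes; it assumes scipy and statsmodels are available and is not the dissertation's 13,310-combination study.

```python
# Rough sketch of the three procedures compared above on one illustrative table.
import numpy as np
from scipy.stats import chi2_contingency, ttest_rel
from statsmodels.stats.contingency_tables import mcnemar

# Rows: outcome on occasion 1 (0/1); columns: outcome on occasion 2 (0/1).
table = np.array([[30, 10],
                  [4, 26]])

print("McNemar    p =", mcnemar(table, exact=False, correction=True).pvalue)
print("Chi-square p =", chi2_contingency(table, correction=True)[1])

# The paired t test needs the underlying 0/1 pairs, reconstructed from the table.
before = np.repeat([0, 0, 1, 1], table.ravel())
after = np.repeat([0, 1, 0, 1], table.ravel())
print("Paired t   p =", ttest_rel(before, after).pvalue)
```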
A Relationship-based Cross National Customer Decision-making Model in the Service Industry (open access)

In 2012, the CIA World Factbook showed that the service sector contributed about 76.6% and 51.4% of the 2010 gross national product of the United States and Ghana, respectively. Research in the services area shows that a firm's success in today's competitive business environment is dependent upon its ability to deliver superior service quality. However, these studies have yet to address factors that influence customers to remain committed to a mass service in economically diverse countries. In addition, there is little research on established service quality measures pertaining to the mass service domain. This dissertation applies Rusbult's investment model of relationship commitment and examines its psychological impact on the commitment level of a customer towards a service in two economically diverse countries. In addition, service quality is conceptualized as a hierarchical construct in the mass service (banking) and specific dimensions are developed on which customers assess their quality evaluations. Using PLS path modeling, a structural equation modeling approach to data analysis, service quality as a hierarchical third-order construct was found to have three primary dimensions and six sub-dimensions. The results also established that a country's national economy has a moderating effect on the relationship between service quality and …
Date: August 2013
Creator: Boakye, Kwabena G.
System: The UNT Digital Library
The Impact of Culture on the Decision Making Process in Restaurants (open access)

Understanding the process of consumers during key purchasing decision points is the margin between success and failure for any business. The cultural differences between the factors that affect consumers in their decision-making process are the motivation of this research. The purpose of this research is to extend the current body of knowledge about decision-making factors by developing and testing a new theoretical model to measure how culture may affect the attitudes and behaviors of consumers in restaurants. This study has its theoretical foundation in the theory of service quality, the theory of planned behavior, and rational choice theory. To understand how culture affects the decision-making process and perceived satisfaction, it is necessary to analyze the relationships among the decision factors and attitudes. The findings of this study contribute by building theory and having practical implications for restaurant owners and managers. This study employs a mixed methodology of qualitative and quantitative research. More specifically, the methodologies employed include the development of a framework and testing of that framework via collection of data using semi-structured interviews and a survey instrument. Considering this framework, we test culture as a moderating relationship by using respondents’ birth country, parents’ birth country, and ethnic identity. The results …
Date: August 2015
Creator: Boonme, Kittipong
System: The UNT Digital Library
Financial Leverage and the Cost of Capital (open access)

The objective of the research reported in this dissertation is to conduct an empirical test of the hypothesis that, excluding income tax effects, the cost of capital to a firm is independent of the degree of financial leverage employed by the firm. This hypothesis, set forth by Franco Modigliani and Merton Miller in 1958, represents a challenge to the traditional view on the subject, a challenge which carries implications of considerable importance in the field of finance. The challenge has led to a lengthy controversy which can ultimately be resolved only by subjecting the hypothesis to empirical test. The basis of the test was Modigliani and Miller's Proposition II, a corollary of their fundamental hypothesis. Proposition II, in effect, states that equity investors fully discount any increase in risk due to financial leverage so that there is no possibility for the firm to reduce its cost of capital by employing financial leverage. The results of the research reported in this dissertation do not support that contention. The study indicates that, if equity investors require any increase in premium for increasing financial leverage, the premium required is significantly less than that predicted by the Modigliani-Miller Proposition II, over the range of …
Date: December 1977
Creator: Brust, Melvin F.
System: The UNT Digital Library
Derivation of Probability Density Functions for the Relative Differences in the Standard and Poor's 100 Stock Index Over Various Intervals of Time (open access)

In this study a two-part mixed probability density function was derived which described the relative changes in the Standard and Poor's 100 Stock Index over various intervals of time. The density function is a mixture of two different halves of normal distributions. Optimal values for the standard deviations for the two halves and the mean are given. Also, a general form of the function is given which uses linear regression models to estimate the standard deviations and the means. The density functions allow stock market participants trading index options and futures contracts on the S & P 100 Stock Index to determine probabilities of success or failure of trades involving price movements of certain magnitudes in given lengths of time.
Date: August 1988
Creator: Bunger, R. C. (Robert Charles)
System: The UNT Digital Library
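
The abstract describes a density built from two different halves of normal distributions. Written generically, a two-piece ("split") normal density with mode μ and left/right spreads σ₁, σ₂ takes the form below; the dissertation's fitted parameter values and regression-based estimates are not reproduced here.

```latex
% Generic two-piece normal density: two half-normals joined at a common mode
% \mu, with spreads \sigma_1 (left) and \sigma_2 (right); the shared constant
% makes the two pieces integrate to one.
f(x) =
\begin{cases}
  \dfrac{2}{\sqrt{2\pi}\,(\sigma_1+\sigma_2)}
    \exp\!\left(-\dfrac{(x-\mu)^2}{2\sigma_1^2}\right), & x \le \mu,\\[2ex]
  \dfrac{2}{\sqrt{2\pi}\,(\sigma_1+\sigma_2)}
    \exp\!\left(-\dfrac{(x-\mu)^2}{2\sigma_2^2}\right), & x > \mu.
\end{cases}
```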
Robustness of Parametric and Nonparametric Tests When Distances between Points Change on an Ordinal Measurement Scale (open access)

The purpose of this research was to evaluate the effect on parametric and nonparametric tests using ordinal data when the distances between points changed on the measurement scale. The research examined the performance of Type I and Type II error rates using selected parametric and nonparametric tests.
Date: August 1994
Creator: Chen, Andrew H. (Andrew Hwa-Fen)
System: The UNT Digital Library
Call Option Premium Dynamics (open access)

This study has a twofold purpose: to demonstrate the use of the Marquardt compromise method in estimating the unknown parameters contained in the probability call-option pricing models and to test empirically the following models: the Boness, the Black-Scholes, the Merton proportional dividend, the Ingersoll differential tax, and the Ingersoll proportional dividend and differential tax.
Date: December 1982
Creator: Chen, Jim
System: The UNT Digital Library
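
Of the five models listed, the Black-Scholes call price is the easiest to show compactly; the Python sketch below is a standard textbook implementation with hypothetical inputs, not the Marquardt-based estimation procedure or the other models tested in the dissertation.

```python
# Quick reference implementation of the Black-Scholes European call price.
from math import log, sqrt, exp
from scipy.stats import norm

def black_scholes_call(S, K, T, r, sigma):
    """Call price for spot S, strike K, time to expiry T (years),
    risk-free rate r, and volatility sigma (no dividends)."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm.cdf(d1) - K * exp(-r * T) * norm.cdf(d2)

print(black_scholes_call(S=100, K=105, T=0.5, r=0.05, sigma=0.25))
```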
Comparing the Powers of Several Proposed Tests for Testing the Equality of the Means of Two Populations When Some Data Are Missing (open access)

In comparing the means of two normally distributed populations with unknown variance, two tests very often used are the two independent sample and the paired sample t tests. There is a possible gain in the power of the significance test by using the paired sample design instead of the two independent samples design.
Date: May 1994
Creator: Dunu, Emeka Samuel
System: The UNT Digital Library
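
To make the power comparison concrete, here is a small simulation sketch contrasting the two-independent-sample and paired t tests on positively correlated pairs; the settings are illustrative and the missing-data aspect of the dissertation is omitted.

```python
# Power of the independent-sample t test vs. the paired t test when
# observations are positively correlated within pairs (illustrative settings).
import numpy as np
from scipy.stats import ttest_ind, ttest_rel

rng = np.random.default_rng(0)
n, shift, rho, reps = 30, 0.5, 0.6, 2000
cov = [[1.0, rho], [rho, 1.0]]

rej_ind = rej_rel = 0
for _ in range(reps):
    pairs = rng.multivariate_normal([0.0, shift], cov, size=n)
    x, y = pairs[:, 0], pairs[:, 1]
    rej_ind += ttest_ind(x, y).pvalue < 0.05
    rej_rel += ttest_rel(x, y).pvalue < 0.05

print("power, independent-sample t:", rej_ind / reps)
print("power, paired t:            ", rej_rel / reps)
```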
The Establishment of Helicopter Subsystem Design-to-Cost Estimates by Use of Parametric Cost Estimating Models (open access)

The purpose of this research was to develop parametric Design-to-Cost models for selected major subsystems of certain helicopters. This was accomplished by analyzing the relationships between historical production costs and certain design parameters which are available during the preliminary design phase of the life cycle. Several potential contributions are identified in the areas of academia, government, and industry. Application of the cost models will provide estimates beneficial to the government and DoD by allowing derivation of realistic Design-to-Cost estimates. In addition, companies in the helicopter industry will benefit by using the models for two key purposes: (1) optimizing helicopter design through cost-effective tradeoffs, and (2) justifying a proposal estimate.
Date: August 1979
Creator: Gilliland, Johnny J.
System: The UNT Digital Library
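
As a hedged illustration of a parametric cost-estimating relationship of the kind described above, the sketch below regresses hypothetical subsystem production costs on two design parameters by ordinary least squares; the variables and figures are invented, not the dissertation's helicopter data.

```python
# Illustrative cost-estimating relationship (CER): cost = b0 + b1*weight + b2*power.
import numpy as np

# columns: empty weight (lb), installed power (shp); target: production cost ($K)
X = np.array([[1800, 650], [2200, 800], [2600, 950], [3100, 1100], [3500, 1300]])
cost = np.array([410.0, 505.0, 600.0, 720.0, 830.0])

A = np.column_stack([np.ones(len(X)), X])          # add an intercept column
coef, *_ = np.linalg.lstsq(A, cost, rcond=None)     # ordinary least squares fit
print("CER coefficients (b0, b1, b2):", np.round(coef, 4))
print("design-to-cost estimate for a 2,900 lb / 1,000 shp subsystem ($K):",
      round(coef @ [1, 2900, 1000], 1))
```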
Economic Statistical Design of Inverse Gaussian Distribution Control Charts (open access)

Statistical quality control (SQC) is one technique companies are using in the development of a Total Quality Management (TQM) culture. Shewhart control charts, a widely used SQC tool, rely on an underlying normal distribution of the data. Often data are skewed. The inverse Gaussian distribution is a probability distribution that is well-suited to handling skewed data. This analysis develops models and a set of tools usable by practitioners for the constrained economic statistical design of control charts for inverse Gaussian distribution process centrality and process dispersion. The use of this methodology is illustrated by the design of an x-bar chart and a V chart for an inverse Gaussian distributed process.
Date: August 1990
Creator: Grayson, James M. (James Morris)
System: The UNT Digital Library
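
The dissertation's economic statistical design balances sampling costs against false-alarm and detection costs; that optimization is not shown here. The sketch below only illustrates probability-based control limits for an inverse-Gaussian-distributed quality characteristic using scipy's parametrization, with made-up parameter values.

```python
# Probability-limit sketch for an inverse Gaussian process characteristic.
# scipy's invgauss takes a shape parameter mu plus a scale; values below are
# illustrative, not fitted to any real process.
from scipy.stats import invgauss

mu_shape, scale = 0.5, 10.0
alpha = 0.0027                      # roughly the usual 3-sigma false-alarm rate

lcl = invgauss.ppf(alpha / 2, mu_shape, scale=scale)
center = invgauss.mean(mu_shape, scale=scale)
ucl = invgauss.ppf(1 - alpha / 2, mu_shape, scale=scale)
print(f"LCL = {lcl:.3f}, center line = {center:.3f}, UCL = {ucl:.3f}")
```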
The Fixed v. Variable Sampling Interval Shewhart X-Bar Control Chart in the Presence of Positively Autocorrelated Data (open access)

This study uses simulation to examine differences between fixed sampling interval (FSI) and variable sampling interval (VSI) Shewhart X-bar control charts for processes that produce positively autocorrelated data. The influence of sample size (1 and 5), autocorrelation parameter, shift in process mean, and length of time between samples is investigated by comparing average time (ATS) and average number of samples (ANSS) to produce an out of control signal for FSI and VSI Shewhart X-bar charts. These comparisons are conducted in two ways: control chart limits pre-set at ±3σ_x / √n and limits computed from the sampling process. Proper interpretation of the Shewhart X-bar chart requires the assumption that observations are statistically independent; however, process data are often autocorrelated over time. Results of this study indicate that increasing the time between samples decreases the effect of positive autocorrelation between samples. Thus, with sufficient time between samples the assumption of independence is essentially not violated. Samples of size 5 produce a faster signal than samples of size 1 with both the FSI and VSI Shewhart X-bar chart when positive autocorrelation is present. However, samples of size 5 require the same time when the data are independent, indicating that this effect is a …
Date: May 1993
Creator: Harvey, Martha M. (Martha Mattern)
System: The UNT Digital Library
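
A minimal sketch of the simulation setting described above, assuming an AR(1) process for the positive autocorrelation and fixed ±3σ_x/√n limits; the variable-sampling-interval logic and the ATS/ANSS bookkeeping are omitted.

```python
# Generate positively autocorrelated AR(1) data, form subgroup means of size n,
# and count signals against fixed Shewhart limits of +/- 3*sigma_x/sqrt(n).
import numpy as np

rng = np.random.default_rng(1)
phi, sigma, n, subgroups = 0.7, 1.0, 5, 200

# AR(1): x_t = phi * x_(t-1) + e_t, giving positive autocorrelation for phi > 0.
e = rng.normal(0.0, sigma, size=n * subgroups)
x = np.zeros_like(e)
for t in range(1, len(e)):
    x[t] = phi * x[t - 1] + e[t]

sigma_x = sigma / np.sqrt(1 - phi**2)       # stationary st. dev. of the AR(1) process
limit = 3 * sigma_x / np.sqrt(n)
xbar = x.reshape(subgroups, n).mean(axis=1)
print("subgroup means outside +/-", round(limit, 3), ":", int(np.sum(np.abs(xbar) > limit)))
```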
Investigating the relationship between the business performance management framework and the Malcolm Baldrige National Quality Award framework. (open access)

The business performance management (BPM) framework helps an organization continuously adjust and successfully execute its strategies. BPM helps increase flexibility by providing managers with an early alert about changes and, as a result, allows faster response to such changes. The Malcolm Baldrige National Quality Award (MBNQA) framework provides a basis for self-assessment and a systems perspective for managing an organization's key processes for achieving business results. The MBNQA framework is a more comprehensive framework and encapsulates the underlying constructs in the BPM framework. The objectives of this dissertation are fourfold: (1) to validate the underlying relationships presented in the 2008 MBNQA framework, (2) to explore the MBNQA framework at the dimension level, and develop and test constructs measured at that level in a causal model, (3) to validate and create a common general framework for the business performance model by integrating the practitioner literature with basic theory including existing MBNQA theory, and (4) to integrate the BPM framework and the MBNQA framework into a new framework (BPM-MBNQA framework) that can guide organizations in their journey toward achieving and sustaining competitive and strategic advantages. The purpose of this study is to achieve these objectives by means of a combination of methodologies …
Date: August 2009
Creator: Hossain, Muhammad Muazzem
System: The UNT Digital Library
Developing Criteria for Extracting Principal Components and Assessing Multiple Significance Tests in Knowledge Discovery Applications (open access)

With advances in computer technology, organizations are able to store large amounts of data in data warehouses. There are two fundamental issues researchers must address: the dimensionality of data and the interpretation of multiple statistical tests. The first issue addressed by this research is the determination of the number of components to retain in principal components analysis. This research establishes regression, asymptotic theory, and neural network approaches for estimating mean and 95th percentile eigenvalues for implementing Horn's parallel analysis procedure for retaining components. Certain methods perform better for specific combinations of sample size and numbers of variables. The adjusted normal order statistic estimator (ANOSE), an asymptotic procedure, performs the best overall. Future research is warranted on combining methods to increase accuracy. The second issue involves interpreting multiple statistical tests. This study uses simulation to show that Parker and Rothenberg's technique using a density function with a mixture of betas to model p-values is viable for p-values from central and non-central t distributions. The simulation study shows that final estimates obtained in the proposed mixture approach reliably estimate the true proportion of the distributions associated with the null and nonnull hypotheses. Modeling the density of p-values allows for better control of …
Date: August 1999
Creator: Keeling, Kellie Bliss
System: The UNT Digital Library
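
Horn's parallel analysis, mentioned above, can be illustrated in a few lines: retain the components whose observed eigenvalues exceed the corresponding mean or 95th percentile eigenvalues from random data of the same dimensions. The estimators developed in the dissertation (regression, asymptotic, neural network, ANOSE) are not reproduced; the data below are random stand-ins.

```python
# Bare-bones Horn's parallel analysis using simulated reference eigenvalues.
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_vars, n_sims = 300, 10, 500

data = rng.normal(size=(n_obs, n_vars))      # stand-in for a real data set
obs_eigs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

rand_eigs = np.array([
    np.linalg.eigvalsh(np.corrcoef(rng.normal(size=(n_obs, n_vars)), rowvar=False))[::-1]
    for _ in range(n_sims)
])

# Retain components whose eigenvalues exceed the 95th percentile reference values.
retain = obs_eigs > np.percentile(rand_eigs, 95, axis=0)
print("components to retain:", int(retain.sum()))
```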
The Effect of Value Co-creation and Service Quality on Customer Satisfaction and Commitment in Healthcare Management (open access)

Despite much interest in service quality and various other service quality measures, scholars appear to have overlooked the overall concept of quality. More specifically, previous research has yet to integrate the effect of the customer network and customer knowledge into the measurement of quality. In this work, it is posited that the evaluation of quality is based on both the delivered value from the provider as well as the value developed from the relationships among customers and between customers and providers. This research examines quality as a broad and complex issue, and uses the “Big Quality” concept within the context of routine healthcare service. The last few decades have witnessed interest and activities surrounding the subject of quality and value co-creation. These are core features of Service-Dominant (S-D) logic theory. In this theory, the customer is a collaborative partner who co-creates value with the firm. Customers create value through the strength of their relations and network, and they take a central role in value actualization as value co-creator. I propose to examine the relationship between quality and the constructs of value co-creation. As well, due to the pivotal role of the decision-making process in customer satisfaction, I will also operationalize …
Date: August 2015
Creator: Kwon, Junhyuk
System: The UNT Digital Library
A Simulation Study Comparing Various Confidence Intervals for the Mean of Voucher Populations in Accounting (open access)

This research examined the performance of three parametric methods for confidence intervals: the classical, the Bonferroni, and the bootstrap-t method, as applied to estimating the mean of voucher populations in accounting. Usually auditing populations do not follow standard models. The population for accounting audits generally is a nonstandard mixture distribution in which the audit data set contains a large number of zero values and a comparatively small number of nonzero errors. This study assumed a situation in which only overstatement errors exist. The nonzero errors were assumed to be normally, exponentially, and uniformly distributed. Five indicators of performance were used. The classical method was found to be unreliable. The Bonferroni method was conservative for all population conditions. The bootstrap-t method was excellent in terms of reliability, but the lower limit of the confidence intervals produced by this method was unstable for all population conditions. The classical method provided the shortest average width of the confidence intervals among the three methods. This study provided initial evidence as to how the parametric bootstrap-t method performs when applied to the nonstandard distribution of audit populations of line items. Further research should provide a reliable confidence interval for a wider variety of accounting populations.
Date: December 1992
Creator: Lee, Ihn Shik
System: The UNT Digital Library
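
A schematic bootstrap-t interval for the mean of an audit-style population (mostly zero errors with a few positive overstatements) is sketched below; the mixture proportions, error distribution, and resample count are illustrative, not the study's simulation design.

```python
# Bootstrap-t confidence interval for the mean of a zero-inflated error population.
import numpy as np

rng = np.random.default_rng(42)
errors = np.concatenate([np.zeros(180), rng.exponential(scale=50.0, size=20)])

n, B = len(errors), 2000
mean, se = errors.mean(), errors.std(ddof=1) / np.sqrt(n)

t_stats = []
for _ in range(B):
    resample = rng.choice(errors, size=n, replace=True)
    se_b = resample.std(ddof=1) / np.sqrt(n)
    if se_b > 0:                               # guard against all-zero resamples
        t_stats.append((resample.mean() - mean) / se_b)

# Bootstrap-t interval: [mean - t*_(.975) * se, mean - t*_(.025) * se]
lo_t, hi_t = np.percentile(t_stats, [2.5, 97.5])
print("95% bootstrap-t CI:", (mean - hi_t * se, mean - lo_t * se))
```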
Reliable Prediction Intervals and Bayesian Estimation for Demand Rates of Slow-Moving Inventory (open access)

Application of multisource feedback (MSF) increased dramatically and became widespread globally in the past two decades, but there was little conceptual work regarding self-other agreement and few empirical studies investigated self-other agreement in other cultural settings. This study developed a new conceptual framework of self-other agreement and used three samples to illustrate how national culture affected self-other agreement. These three samples included 428 participants from China, 818 participants from the US, and 871 participants from globally dispersed teams (GDTs). An EQS procedure and a polynomial regression procedure were used to examine whether the covariance matrices were equal across samples and whether the relationships between self-other agreement and performance would be different across cultures, respectively. The results indicated MSF could be applied to China and GDTs, but the pattern of relationships between self-other agreement and performance was different across samples, suggesting that the results found in the U.S. sample were the exception rather than rule. Demographics also affected self-other agreement disparately across perspectives and cultures, indicating self-concept was susceptible to cultural influences. The proposed framework only received partial support but showed great promise to guide future studies. This study contributed to the literature by: (a) developing a new framework of self-other …
Date: August 2007
Creator: Lindsey, Matthew Douglas
System: The UNT Digital Library
Mathematical Programming Approaches to the Three-Group Classification Problem (open access)

In the last twelve years there has been considerable research interest in mathematical programming approaches to the statistical classification problem, primarily because they are not based on the assumptions of the parametric methods (Fisher's linear discriminant function, Smith's quadratic discriminant function) for optimality. This dissertation focuses on the development of mathematical programming models for the three-group classification problem and examines the computational efficiency and classificatory performance of proposed and existing models. The classificatory performance of these models is compared with that of Fisher's linear discriminant function and Smith's quadratic discriminant function. Additionally, this dissertation investigates theoretical characteristics of mathematical programming models for the classification problem with three or more groups. A computationally efficient model for the three-group classification problem is developed. This model minimizes directly the number of misclassifications in the training sample. Furthermore, the classificatory performance of the proposed model is enhanced by the introduction of a two-phase algorithm. The same algorithm can be used to improve the classificatory performance of any interval-based mathematical programming model for the classification problem with three or more groups. A modification to improve the computational efficiency of an existing model is also proposed. In addition, a multiple-group extension of a mathematical programming model …
Date: August 1993
Creator: Loucopoulos, Constantine
System: The UNT Digital Library
A Model for the Efficient Investment of Temporary Funds by Corporate Money Managers (open access)

In this study seventeen various relationships between yields of three-month, six-month, and twelve-month maturity negotiable CD's and U.S. Government T-Bills were analyzed to find a leading indicator of short-term interest rates. Each of the seventeen relationships was tested for correlation with actual three-, six-, and twelve-month yields from zero to twenty-six weeks in the future. Only one relationship was found to be significant as a leading indicator. This was the twelve-month yield minus the six-month yield adjusted for scale and accumulated where the result was positive. This indicator (variable nineteen in the study) was further tested for usefulness as a trend indicator by transforming it into a function consisting of +1 (when its slope was positive), 0 (when its slope was zero), and -1 (when its slope was negative). Stage II of the study consisted of constructing a computer-aided model employing variable nineteen as a forecasting device. The model accepts a week-by-week minimum cash balance forecast, and the past thirteen weeks' yields of three-, six-, and twelve-month CD's as input. The output of the model consists of a cash time availability schedule, a numerical listing of variable nineteen values, the thirteen-week history of three-, six-, and twelve-month CD yields, a …
Date: August 1974
Creator: McWilliams, Donald B., 1936-
System: The UNT Digital Library
Validation and Investigation of the Four Aspects of Cycle Regression: A New Algorithm for Extracting Cycles (open access)

The cycle regression analysis algorithm is the most recent addition to a group of techniques developed to detect "hidden periodicities." This dissertation investigates four major aspects of the algorithm. The objectives of this research are: 1. To develop an objective method of obtaining an initial estimate of the cycle period; the present procedure of obtaining this estimate involves considerable subjective judgment; 2. To validate the algorithm's success in extracting cycles from multi-cyclical data; 3. To determine if a consistent relationship exists among the smallest amplitude, the error standard deviation, and the number of replications of a cycle contained in the data; 4. To investigate the behavior of the algorithm in the predictions of major drops.
Date: December 1982
Creator: Mehta, Mayur Ravishanker
System: The UNT Digital Library
A Quantitative Approach to Medical Decision Making (open access)

The purpose of this study is to develop a technique by which a physician may use a predetermined data base to derive a preliminary diagnosis for a patient with a given set of symptoms. The technique will not yield an absolute diagnosis, but rather will point the way to a set of most likely diseases upon which the physician may concentrate his efforts. There will be no reliance upon a data base compiled from poorly kept medical records with non-standardization of terminology. While this study produces a workable tool for the physician to use in the process of medical diagnosis, the ultimate responsibility for the patient's welfare must still rest with the physician.
Date: May 1975
Creator: Meredith, John W.
System: The UNT Digital Library
Classification by Neural Network and Statistical Models in Tandem: Does Integration Enhance Performance? (open access)

The major purposes of the current research are twofold. The first purpose is to present a composite approach to the general classification problem by using outputs from various parametric statistical procedures and neural networks. The second purpose is to compare several parametric and neural network models on a transportation planning related classification problem and five simulated classification problems.
Date: December 1998
Creator: Mitchell, David
System: The UNT Digital Library
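
One plausible reading of the "in tandem" idea above is to feed the class probabilities produced by a parametric statistical classifier into a neural network as additional features. The sketch below does exactly that with scikit-learn on synthetic data; it is an assumption-laden illustration, not the dissertation's models or its transportation-planning data.

```python
# Composite classifier: logistic-regression probabilities appended as features
# for a neural network, trained and scored on synthetic three-class data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=8, n_classes=3,
                           n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

logit = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
X_tr_aug = np.hstack([X_tr, logit.predict_proba(X_tr)])
X_te_aug = np.hstack([X_te, logit.predict_proba(X_te)])

nnet = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
nnet.fit(X_tr_aug, y_tr)
print("tandem accuracy:", nnet.score(X_te_aug, y_te))
```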