IRAQI JOURNAL OF STATISTICAL SCIENCES
https://stats.mosuljournals.com/
Sat, 01 Jun 2024 00:00:00 +0430

Performance of some Yang and Chang estimators in Logistic Regression Model.
https://stats.mosuljournals.com/article_183228.html
In logistic regression models, the maximum likelihood (ML) method is one of the most commonly used methods for estimating the model parameters. However, multicollinearity makes the parameter estimates unstable, and the resulting mean square error (MSE) cannot be relied on. Several biased estimators have been proposed to handle multicollinearity, and the logistic Yang and Chang (LYC) estimator is one of them. Research has also shown that the biasing parameter affects the value of the MSE. In this paper, we propose seven LYC biasing estimators, all of which were evaluated in Monte Carlo simulation studies and applied to the Pena data set. The simulation results show that the LYC estimators outperform both logistic ridge regression (LRR) and the ML approach. The application to the real Pena data set agrees with the simulation results.

The Implications of Discriminant Analysis Function in Classifying the Obesity of Childhood < 15 in Egypt "An Applied Study on the Data of EDHS 2015"
https://stats.mosuljournals.com/article_183229.html
Relatively few statistical studies have been published on classifying obesity, from a risk-based perspective, among children under 15 years old in Egypt. Obesity is regarded as a critical risk that may impede progress toward the desired level of human development because of its comorbidities and chronic diseases. Among children in particular, these may persist from childhood through adulthood and old age, and state officials have taken no tangible measures to control the severity of this risk effectively. For that reason, this paper examines some key determinants available in the 2015 Egyptian Demographic Health Survey (EDHS) data to test whether they have a statistically significant effect on the classification of Egyptian children's weight (obese or non-obese), where a Body Mass Index (BMI) above the risk threshold of 29.9 is classified as a confirmed obesity case. Planners should therefore focus on these determinants to support children in this age group (0-14) until they reach adulthood.

This study applied discriminant analysis to obtain a statistical technique for classifying children's weight status. A new discriminant function was derived that classifies the BMI of Egyptian children under 15 into two main groups, obese or non-obese, with a 65% classification rate. It is based on seven variables: age, educational level, years of schooling, gender, circumcision status, residency, and breastfeeding, listed in order of the strength of each variable's correlation coefficient in the proposed model.

The recommendations of this study were divided into health and environmental, social, and economic work groups that may contribute to decision-making. The aim is to sustain the health of Egyptian children through an ideal weight, allowing them to contribute effectively to production, progress, and shaping the future for the next generations.

Robust Weighted Least Squares Method using different schemes of M-estimators (RWLSM), A comparative Study
https://stats.mosuljournals.com/article_183230.html
This research reduces or excludes the effect of violating the assumption of normally distributed data, caused by the presence of different types of outliers, when choosing the best regression equation by robust methods. This was achieved by introducing weights from robust methods into the estimation and testing their robustness and suitability for the model in advance. The weights resulting from the most efficient robust methods were then selected and introduced into the stages of selecting the best regression equation. The result is a model that is both robust and reduced in dimension while gaining efficiency.

A simulation approach was used on models with different dimensions, different sample sizes, and different contamination rates: in the dependent variable, in the independent variables, and in both together. The focus was on the possible impact of outliers on which variables remain in the model and which are deleted.

To achieve the aim of the paper, a number of robust estimation methods were studied, and the results were compared with the ordinary least squares (OLS) method and the robust adaptive LASSO method on simulated experimental data, as well as on data for a sample of thalassemia patients in Nineveh province.

Classification of Circular Mass of Breast Cancer Using Artificial Neural Network vs. Discriminant Analysis in Medical Image Processing
https://stats.mosuljournals.com/article_183231.html
In recent years, there has been a notable increase in interest regarding intelligent classification techniques rooted in Machine Learning within the domain of medical science. Specifically, machine learning, a pivotal area of artificial intelligence, has been extensively utilized to aid medical professionals in predicting and diagnosing various diseases. This study applies two distinct machine learning algorithms to address a medical diagnosis concern related to circular masses in breast cancer. The dataset encompasses 150 cases of breast cancer patients. The primary objective is to assess and compare the effectiveness of artificial neural networks (ANNs) and linear discriminant analysis (LDA) classifiers based on key criteria: accuracy, sensitivity, specificity, and the kappa coefficient in predicting circular masses within breast cancer. Results indicate that the performance of the ANN classifier surpasses that of the LDA model, achieving an accuracy of 97.7%, sensitivity of 95%, specificity of 100%, and a kappa coefficient of 95.31%. Additionally, the final fitted models unveil the pivotal factors significantly influencing circular masses in breast cancer, highlighting Solidity and Entropy as the most critical variables.

Reliability Estimation of the Lomax Distribution under Ranked Set Sampling (RSS) With Application
https://stats.mosuljournals.com/article_183232.html
Researchers sometimes face difficulty in obtaining data because of cost, effort, or time constraints. In such cases, sampling methods such as Ranked Set Sampling (RSS) are used so that the researcher can achieve the desired goal with less time, effort, and cost. In this paper, the reliability function of the Lomax distribution was estimated under RSS using four estimation methods: Maximum Likelihood Estimation (MLE), Maximum Product of Spacings (MPS), Least Squares (LS), and Weighted Least Squares (WLS). A Monte Carlo simulation, carried out in R, was used to compare the methods, with the best estimate chosen by the Mean Square Error (MSE) criterion. The simulation results showed that MPS is the most efficient of these methods for estimating the reliability function of the Lomax distribution under RSS. The methods were also applied to real data representing the times to the beginning of complete recovery (disease remission times), in months, for a sample of 96 bladder cancer patients drawn using RSS.

A Parametric Regression Model using Power Chris-Jerry Distribution with Application to Censored Data
https://stats.mosuljournals.com/article_183249.html
The interdependency of various areas of Statistics is gaining attention in the literature. This is made possible by innovations such as the log-transformation of a distribution into a parametric regression model, which integrates contemporary probability distribution models with the classical regression method. The new model is preferred because it applies to a wider range of scenarios. In this article, the base distribution is the Power Chris-Jerry distribution. The equivalent reparametrized regression model was derived, and the maximum likelihood estimation procedure under a censored sample was also considered. A simulation study of the log-power Chris-Jerry regression model was carried out, and measures of performance are presented. Censored lifetime data for COVID-19 patients were used to justify the motivation for developing the new model, and twelve competing models were compared with the proposed regression. The results show that the proposed model performs better than its competitors.

Estimating the Transitional Probabilities of the E.Coli Gene Chain by Maximum Likelihood Method and Bayes Method
https://stats.mosuljournals.com/article_183250.html
Estimators of the transition matrix of a Markov chain are often inaccurate, and the transition matrix is usually taken as given. Many methods are used to estimate the transition probability matrix in different cases, the most famous of which is the Maximum Likelihood Method. To find a good estimator for the transition probability matrix of a Markov chain, this paper uses a Bayes method and a Proposed Method, with the aim of obtaining the transition probabilities with the least variance. The Escherichia coli (E. coli) gene chain was chosen as the application because of its importance in medical research and in discovering and manufacturing treatments, which relies on knowing the final form of its gene chain. Testing showed that the E. coli gene chain represents a Markov chain. Both the transition probability matrix and the variance of the transition probabilities were then estimated using the Proposed Method, the Bayes method, and the Maximum Likelihood Method. In terms of variance, the Proposed Method was found to be better than the Bayes method and the Maximum Likelihood Method.

Improving generalized ridge estimator for the gamma regression model.
https://stats.mosuljournals.com/article_183251.html
It has been consistently proven that the ridge estimator is an effective shrinking strategy for reducing the effects of multicollinearity. An effective model to use when the response variable is positively skewed is the Gamma Regression Model (GRM). However, it is well known that the existence of multicollinearity can have a detrimental impact on the variance of the maximum likelihood estimator (MLE) of the gamma regression coefficients. The generalized ridge estimator is suggested in this study as a solution to the ridge estimator's limitation. The shrinkage matrix has been estimated using a number of different techniques. Our Monte Carlo simulation and actual data application findings indicate that the suggested estimator, regardless of the kind of estimating method of shrinkage matrix, is superior to the MLE and ridge estimator in terms of Mean Square Error (MSE). Additionally, compared to other methods, some shrinkage matrix estimation techniques can significantly enhance results.

Study of Alpha Power Weibull Distribution with Application
https://stats.mosuljournals.com/article_183252.html
The Gull Alpha Power Transform (GAPT) is a family that introduces a single additional parameter to a classical continuous probability distribution. The parameter increases the flexibility of the classical distribution. This family is applied to the standard Weibull distribution to obtain the Gull Alpha Power Transform Weibull (GAPTW) distribution, which is more flexible in application. Some statistical properties of the GAPTW distribution, namely the median, quartiles, moments, and mode, are discussed. The Maximum Product of Spacings (MPS) and Cramér-von Mises (CVM) methods are used to estimate the parameters of the distribution. It is concluded that the GAPTW distribution is heavier tailed than the Weibull distribution and that its hazard rate (hr) function can be increasing, decreasing, or bathtub shaped when the shape parameters are defined on subintervals of their parameter space. On the application side, the GAPTW distribution is flexible in the medical field, specifically in modeling cancer data, and in engineering.

A survey on E- payment systems
https://stats.mosuljournals.com/article_183253.html
significant impact on global trade in the modern era. This has given rise to a new industry known as "electronic commerce," or "E-commerce," which is the use of smart devices and the internet for the exchange and display of goods and products. The need for a means of transferring money between people and businesses globally led to the creation of electronic payment, which is the foundation of electronic commerce and involves using computers and other smart devices to transfer money between the bank accounts of the relevant parties. This paper reviews the most important electronic payment methods currently available and the security features of each. The adoption of any method depends on whether the responsible authorities in a particular region make the service available and on that region's standard of living. In general, however, mobile wallet applications that use NFC technology are among the most widely adopted methods worldwide because of their ease of use and the availability of NFC in most modern smartphones.

Estimating the Parameters of Mixture Gamma Distributions Using Maximum Likelihood and Bayesian Method
https://stats.mosuljournals.com/article_183254.html
This paper focuses on the mixture Gamma distribution and uses maximum likelihood and Bayesian techniques to estimate its parameters. The Expectation Maximization (EM) algorithm is used to find the maximum likelihood estimators, and the random Metropolis-Hastings algorithm is used to simulate the Bayesian estimates of the parameters of the mixture gamma distribution. These estimates are then compared using the sum of the modulus of the bias (MBias) and the root-mean-square error (RMSE). The Bayesian estimator is shown to be better than the maximum likelihood estimator.

Variable selection in Logistic regression model using modified firefly algorithms
https://stats.mosuljournals.com/article_183255.html
The logistic regression model is among the most widely used models in many applications, and it is one of the main models in the family of generalized linear models. Like other regression models, it may contain many independent variables, which reduces the accuracy of the model and makes the results harder to interpret. This study aims to use the modified firefly algorithm and compare it with other methods for selecting variables in an exponential regression model using simulation and real data. The results showed that the proposed method performs better than previously used methods and helps reduce the mean square error of the model.

Construction of the Daubechies Wavelet Chart for Quality Control of the Single Value
https://stats.mosuljournals.com/article_183257.html
This paper proposes a new quality control chart for single values based on the Daubechies wavelet. The chart addresses the problem of data contamination before the single-value chart is constructed, and it is compared with the classical Shewhart single-value chart. The proposed charts apply wavelet shrinkage, using the universal threshold estimation method with a soft threshold rule, to address the contamination problem and de-noise the data. The aim is a chart that controls single values more efficiently and is more sensitive in detecting minor changes that may occur in the production process. Accordingly, single-value charts for the Daubechies wavelet of orders 1, 2, and 3 were proposed and applied to randomly generated (simulated) data for several cases and then to real data. Some efficiency indicators of the proposed charts were calculated in MATLAB and compared with those of the classical chart. The results showed that all proposed charts were better than the classical single-value chart for all simulation cases and for the real data.

Classification of Diabetes Data Set from Iraq via Different Machine Learning Techniques
https://stats.mosuljournals.com/article_183258.html
Diabetes has become one of the most prevalent diseases in Iraq and is listed as one of the leading causes of death. Machine learning extracts information effectively by building predictive models from diagnostic medical datasets collected from diabetes patients in Iraq.

In this study, we applied machine learning classification to compare the performance of classification and regression trees (CART), support vector machines (SVM), random forests (RF), linear discriminant analysis (LDA), and K-nearest neighbors (KNN). We sought to design a model that predicts, with maximum accuracy, whether a person has diabetes, is healthy, or is expected to develop diabetes in the future, using two measures: accuracy and kappa.

On the training data, the algorithms ranked by accuracy, from highest to lowest, as RF, CART, SVM, LDA, and KNN. The test data results differed somewhat, with the ranking from highest to lowest being SVM, RF, CART, LDA, and KNN. The training data set refers to the samples used to construct the model, whereas the testing data set is used to evaluate the model's performance.

Based on the assessment criteria discussed above, we chose the best machine learning approach for predicting diabetes mellitus in Iraq. All of the approaches listed above are evaluated using a supervised diabetes testing dataset. The approach that achieves the maximum performance in terms of accuracy and kappa is regarded as the best option.
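The two criteria named above, accuracy and kappa, can both be computed from a confusion matrix. The following is a minimal Python sketch, assuming that kappa refers to Cohen's kappa. The function name and the example counts are hypothetical and are not taken from the study.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of cases whose true class is i and
    whose predicted class is j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of cases on the diagonal (this is the accuracy).
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Hypothetical three-class counts (diabetic, healthy, expected to develop).
    example = [[30, 2, 3], [1, 40, 4], [2, 3, 15]]
    acc, kappa = accuracy_and_kappa(example)
    print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

Kappa discounts the agreement that would occur by chance, so two classifiers with similar accuracy can still differ in kappa when the classes are unbalanced.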
Based on the results, the SVM and RF algorithms predicted diabetes with the highest accuracy.

Improved Mixed Estimator Using Two Auxiliary Variables For Full Extreme Maximum And Minimum Values In Single Phase Sampling
https://stats.mosuljournals.com/article_183259.html
The use of multiple auxiliary variables has been established to improve the precision of ratio, regression, and product estimators. However, the presence of extreme values in the distribution could annul such efficiency (Olatayo et al., 2020). Extreme values may be small (minimum) or large (maximum) values. This study developed a ratio-cum-regression estimator with two auxiliary variables, using the correlation coefficient and the coefficient of variation, under two types of extreme values in the distribution. The study considers full extreme value cases, in which the study variable and both auxiliary variables have extreme values present in their distributions. Theoretical, empirical, and percentage relative efficiency analyses were carried out for the Full High and Maximum Extreme Values (FHMaEV) and Full Low and Minimum Extreme Values (FLMiEV) cases. The analysis showed that the developed estimator is more efficient than the reviewed estimators.
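Percentage relative efficiency is commonly defined as the ratio of the MSE of a reference estimator to the MSE of the proposed estimator, multiplied by 100. The abstract does not state the formula used, so the Python sketch below rests on that assumption. The function name and the MSE values are hypothetical.

```python
def percentage_relative_efficiency(mse_reference, mse_proposed):
    """PRE of a proposed estimator relative to a reference estimator, in percent.

    Assumes the common definition PRE = 100 * MSE(reference) / MSE(proposed).
    Values above 100 mean the proposed estimator has the smaller MSE.
    """
    if mse_reference <= 0 or mse_proposed <= 0:
        raise ValueError("MSE values must be positive")
    return 100.0 * mse_reference / mse_proposed


if __name__ == "__main__":
    # Hypothetical MSE values, not results from the study.
    print(percentage_relative_efficiency(mse_reference=4.0, mse_proposed=2.0))
```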