## Use the robust RFCH method with a polychoric correlation matrix in structural equation modeling when the data are ordinal

IRAQI JOURNAL OF STATISTICAL SCIENCES

Article 3, Volume 19, Issue 2, December 2022, Pages 24-35

Document Type: Research Paper

DOI: 10.33899/iqjoss.2022.176201

Authors

Omar Ibraheem^{*} ^{1}; Mohammed Jasim Mohammed^{2}

^{1}Department of Informatics & Statistic, College of Computer & Mathematical Science, University of Mosul, Mosul, Iraq

^{2}Department of Statistics, College of Administration and Economics, University of Baghdad, Baghdad, Iraq

Abstract

Structural equation modeling (SEM) is a statistical methodology commonly used in the social and administrative sciences and in other fields. This research compares two estimation methods, unweighted least squares with mean and variance adjusted (ULSMV) and weighted least squares with mean and variance adjusted (WLSMV), both with robust standard errors. With a five-point Likert scale, the data are treated as ordinal and the polychoric matrix is used as input to the weighted methods with robust corrections. No study has compared these methods and the impact of outliers on them. A robust algorithm is proposed to clean the data of outliers. The algorithm calculates the robust Reweighted Fast Consistent and High Breakdown (RFCH) correlation matrix, which consists of several steps, and it has been modified to take the clean data before the RFCH correlation matrix is calculated. These outlier-free data are then supplied to the estimation methods, and a correlation matrix is calculated for each method. The purpose is to keep the data ordinal so that the polychoric matrix, which is robust to violation of the normality assumption, can be calculated. A simulation experiment with different sample sizes and degrees of distribution was conducted to assess the accuracy of the proposed method in obtaining clean data. ULSMV and WLSMV were evaluated before and after the treatment by calculating the absolute bias rate of the standard errors and the estimated parameters, and by studying the effect on the fit indexes: the chi-square index, the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI), the Root Mean Square Error of Approximation (RMSEA), and the Standardized Root Mean Square Residual (SRMR), with the robust chi-square corrections for each of WLSMV and ULSMV.

Highlights

We conclude from the simulation results that all methods with robust corrections to the standard errors are affected by outliers. With the proposed RFCH method, the absolute bias rate of the standard errors and parameters decreases significantly for all models, indicating the quality of the algorithm in removing outliers, improving the parameter estimates, and reducing errors. The absolute bias rate of the parameters and standard errors is affected by the degree of distribution, and accuracy is lower when the data are not normally distributed. For the clean data, the comparison between the methods shows that ULSMV and WLSMV are the best methods when the data are treated as ordinal by using the polychoric matrix as input, together with the robust corrections to the standard errors, because they have the lowest bias in the standard errors and in the estimated parameters. Across the simulated sample sizes, at a contamination rate of 20%, the absolute bias rate of the errors increases with the sample size because of the contamination, but after cleaning with the proposed RFCH method the standard errors are stable at the same sample size, which indicates the quality of the method. In terms of overall quality based on the fit indexes, all fit indexes improve after using the proposed method and fall within the ideal cut-off limits after cleaning. The chi-square value is biased by the sample size, rising as the sample size and the degree of distribution increase, so relying on it is not recommended.
The simulation results also show that all fit indexes are affected by the sample size: the accuracy of the fit indexes for the clean data increases with the sample size after using the proposed method, and TLI and CFI approach one, so modeling requires a large sample size. The quality of the fit indexes is also affected by the degree of distribution: when the data are normally distributed and free of outliers, the fit indexes are closer to ideal than under a non-normal distribution. Plotting the residual matrix for all methods shows that, after cleaning with the proposed method, the residuals approach zero and a normal distribution. The robust corrections of Asparouhov and Muthén (2010) in the ULS and DWLS estimation methods gave better results and fit quality when used with the polychoric correlation, especially when the data are non-normally distributed, because the correction is robust for data that are not normally distributed.

Keywords

polychoric correlation matrix; outlier; robust RFCH; SEM; fit indexes; WLSMV; ULSMV

Full Text

Researchers and specialists have addressed various estimation approaches for structural equation modeling (SEM). SEM estimates the model components, the measurement errors, and the correlations among the factors, together with the direct and indirect relationships that connect the variables. Researchers in the social and behavioral sciences use SEM, which has gained widespread appeal in recent decades, to address complex problems. With the wide range of analytic features that SEM offers, researchers can build models that account for latent variables and measurement errors. Estimation commonly relies on maximum likelihood (ML) as well as other techniques, such as generalized least squares (GLS). When certain conditions are met, these estimators possess desirable asymptotic properties, such as unbiasedness, consistency, and efficiency (Hancock and Mueller 2013). Because outliers can undermine these properties, researchers recommend addressing outliers before applying an estimation method. For this reason, a robust approach is proposed: an RFCH-based algorithm trims the outlying values from the data, and the WLSMV and ULSMV methods are then applied with robust corrections to the standard errors and fit indexes. These robust corrections work with non-normally distributed data but remain sensitive to outliers. The proposed algorithm cleans the data of outliers and calculates a robust RFCH (Reweighted Fast Consistent and High Breakdown) matrix. The researcher made a simple modification to the algorithm: the final data obtained after passing through the several estimators are taken before the matrix is calculated, so that these cleaned data can be used in all methods and a polychoric correlation matrix can be computed when the data are ordinal.
The research aims to address the problem of outliers in Likert-scale questionnaire data, where outliers arise both from some respondents answering an item very differently from the others and from data-entry errors, which are likely because modeling requires a large sample size. It studies the effect of outliers on the estimation methods by applying the same methods before and after treatment with robust RFCH, and it examines the effect of the sample size and the degree of distribution on the estimator bias and the standard-error bias. It also studies the effect of the sample size and the degree of distribution on the model's overall fit indexes.
Researchers in the psychological and administrative sciences often use the ML and GLS estimation methods without testing their assumptions, even though these techniques require a normal distribution. Other estimation methods, namely WLSMV and ULSMV, handle non-normal distributions, especially when the data are ordinal. Outliers remain a problem because they affect the parameter estimates, the standard errors, and the fit indexes: although these methods deal with non-normality, they are not robust to outlying values, so the data require treatment with the robust RFCH method before estimation. When the data come from a five-category Likert scale, they are treated as ordinal by using the polychoric matrix together with these methods and corrections.
SEM models consist of two important parts: the measurement model and the structural model. Confirmatory factor analysis (CFA) is used to correct for indicator measurement error in shaping the latent variables (factors). The measurement model for the exogenous variables x and the endogenous variables y is defined in Equation (1), the full structural equation model is defined in Equation (2), and the resulting covariance matrix is given in Equation (3) (Timm 2002; Byrne 2013; Bollen 1989).
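For orientation, the measurement model, the structural model, and the model-implied covariance matrix are commonly written as follows. This is a sketch in standard LISREL notation (as in Bollen 1989); the symbols are assumptions and may differ from the notation of Equations (1) to (3) in the original article.

```latex
% Measurement model for the exogenous (x) and endogenous (y) indicators
x = \Lambda_x \xi + \delta, \qquad y = \Lambda_y \eta + \varepsilon

% Structural model linking the latent variables
\eta = B\eta + \Gamma\xi + \zeta

% Model-implied covariance matrix of (y, x), with A = (I - B)^{-1}
\Sigma(\theta) =
\begin{pmatrix}
\Lambda_y A \left(\Gamma\Phi\Gamma^{\top} + \Psi\right) A^{\top} \Lambda_y^{\top} + \Theta_{\varepsilon}
  & \Lambda_y A \Gamma \Phi \Lambda_x^{\top} \\
\Lambda_x \Phi \Gamma^{\top} A^{\top} \Lambda_y^{\top}
  & \Lambda_x \Phi \Lambda_x^{\top} + \Theta_{\delta}
\end{pmatrix}
```

Here $\Phi$ is the covariance matrix of $\xi$, $\Psi$ that of $\zeta$, and $\Theta_{\delta}$, $\Theta_{\varepsilon}$ are the measurement-error covariance matrices.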
Polychoric correlation, described by Olsson (1979), can be calculated when the data are ordinal. Each of the two ordinal variables has a set of distinct ordered categories. Polychoric correlations are usually computed with the two-stage method defined by Olsson (1979). In the first stage, the observed proportions in each category of an ordinal variable are used, separately for each variable, to estimate the threshold values of its underlying latent response variable. Consider an ordinal variable y1 with thresholds denoted a and an ordinal variable y2 with thresholds denoted b, i,j=0,…,r. The first step sets the thresholds at their estimated values for the r and s categories, Equations (4) and (5), where the univariate standard normal cumulative distribution function is denoted as F1.
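The first stage can be illustrated with a minimal Python sketch, assuming that each threshold is the inverse standard normal CDF of the cumulative category proportion. The function name `estimate_thresholds`, the 1-to-5 coding, the clamping of proportions, and the example data are all hypothetical and are not taken from the article.

```python
from statistics import NormalDist


def estimate_thresholds(responses, n_categories=5):
    """Return the n_categories - 1 threshold estimates for one ordinal variable.

    Responses are assumed to be coded 1..n_categories.
    """
    n = len(responses)
    counts = [0] * n_categories
    for value in responses:
        counts[value - 1] += 1

    standard_normal = NormalDist()
    thresholds = []
    cumulative = 0
    for count in counts[:-1]:  # the last category has no upper threshold
        cumulative += count
        proportion = cumulative / n
        # Assumption: keep proportions away from 0 and 1 so inv_cdf is defined
        proportion = min(max(proportion, 0.5 / n), 1 - 0.5 / n)
        thresholds.append(standard_normal.inv_cdf(proportion))
    return thresholds


# Hypothetical five-category Likert item answered by 100 respondents
likert_item = [1] * 10 + [2] * 20 + [3] * 40 + [4] * 20 + [5] * 10
thresholds = estimate_thresholds(likert_item)
```

The second stage, which estimates the correlation itself by maximizing the bivariate normal likelihood given these thresholds, is not shown here.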
When data are non-normal, the most commonly advocated estimation strategy is the asymptotically distribution-free (ADF) approach (Browne, 1984). This method is appropriate when continuous or ordinal data depart greatly from normality. In the general case, the ADF estimator of θ is obtained under the GLS framework as the vector that minimizes the fit function in Equation (6), in which the weight matrix is stochastic and positive definite. WLS minimizes this fit function (Muthén and Asparouhov 2002; DiStefano 2002).
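As a reminder of the general shape of this fit function, the WLS criterion is commonly written as below. This is a sketch in conventional notation, which is an assumption and may differ from the symbols used in Equation (6): $s$ is the vector of unique sample statistics, $\sigma(\theta)$ the corresponding model-implied vector, and $W$ a positive-definite estimate of the asymptotic covariance matrix of $s$.

```latex
F_{\mathrm{WLS}}(\theta) = \bigl(s - \sigma(\theta)\bigr)^{\top} W^{-1} \bigl(s - \sigma(\theta)\bigr)
```

Replacing $W$ by its diagonal gives a diagonally weighted criterion, and replacing it by the identity matrix gives an unweighted one.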
The diagonally weighted least squares (DWLS) estimator was developed to address the limitations of the full WLS estimator. Specifically, by reducing the statistical sensitivity associated with the full WLS estimator, DWLS removes the need for a very large sample size. DWLS may also incorporate scaling similar to the Satorra-Bentler (SB) scaling approach, resulting in robust DWLS estimation (Hancock and Mueller 2013). The general form of the RWLS fit function is given in Equation (7). With ordinal data, one technique, called cat-DWLS, fits the SEM model to the polychoric correlation matrix rather than to the sample covariance matrix. Its weight matrix includes only the diagonal elements of the estimated asymptotic covariance matrix of the polychoric correlations and threshold estimates (Bollen, 1989; Muthén & Muthén, 2010). However, the usual test statistic TWLS is not sufficient for evaluating model fit, because the test statistic provided by cat-DWLS is no longer asymptotically chi-square distributed, so a robust correction is required. A mean-adjusted chi-square statistic can be used with the cat-DWLS estimator, and Asparouhov and Muthén (2010) proposed a new way to compute the mean- and variance-adjusted statistic (denoted TDWLS-MV), given in Equation (8). It involves the estimated asymptotic covariance matrix of s, the number of unique elements in s, and the number of independent model parameters. Estimation with this correction is called WLSMV (Weighted Least Squares with Mean and Variance Adjusted) (Jia 2016; Muthén 2002).
The ULS approach is a form of OLS estimation that minimizes the sum of squared differences between the sample covariances and the covariances implied by the model. It can yield unbiased estimates from random samples. A drawback of ULS is that all observed variables must be on the same scale. One benefit is that ULS, unlike ML, does not require a positive-definite covariance matrix (Kline 2016), and the estimation method does not require a distributional assumption (Nalbantoğlu Yılmaz 2019). The cat-ULS estimator fits the model, with a saturated threshold structure, by minimizing the fit function in Equation (9), in which the sample input is the polychoric correlation matrix (Savalei and Rhemtulla 2013). Asparouhov and Muthén (2010) proposed a second-order adjustment that does not modify the degrees of freedom. Applied to the cat-ULS estimator, it gives the new mean- and variance-adjusted statistic, ULSMV, in Equation (10). This statistic is similar to the well-known Satorra-Bentler scaled chi-square for continuous outcomes. It is referred to a chi-square distribution with df degrees of freedom, although that is only its approximate asymptotic distribution (Savalei and Rhemtulla 2013; Xia and Yang 2018; Asparouhov and Muthén 2010).
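The idea of a mean- and variance-adjusted statistic that keeps the degrees of freedom can be sketched in Python. The sketch assumes a scaled-and-shifted form, T_MV = a·T + b, with a and b chosen so that the adjusted statistic has mean df and variance 2·df when the unadjusted statistic has mean tr(M) and variance 2·tr(M²). The function name and the numeric inputs are hypothetical, and the exact matrices behind the traces in Equations (8) and (10) are not reproduced here.

```python
import math


def mean_variance_adjusted(t_stat, df, trace_m, trace_m_squared):
    """Scale and shift a test statistic to match the mean and variance of chi-square(df).

    trace_m and trace_m_squared stand for tr(M) and tr(M^2) of the matrix that
    governs the asymptotic distribution of the unadjusted statistic.
    """
    scale = math.sqrt(df / trace_m_squared)                       # a
    shift = df - math.sqrt(df * trace_m ** 2 / trace_m_squared)   # b
    return scale * t_stat + shift, scale, shift


# Hypothetical inputs, not results from the article
adjusted, a, b = mean_variance_adjusted(
    t_stat=20.0, df=5, trace_m=10.0, trace_m_squared=20.0
)
```

With these inputs the scale factor is 0.5 and the shift is 0, so the adjusted statistic is half the unadjusted one; in general both factors are nontrivial.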
Olive and Hawkins (2010) developed the Reweighted Fast Consistent and High breakdown (RFCH) estimators of location and scatter, which are faster than the Fast MCD developed by Rousseeuw and Driessen (1999). The attractive features of the technique are that its computation is very fast, even faster than Fast MCD (Zhang et al., 2012), and that it yields consistent estimators. RFCH uses the consistent DGK estimator (Devlin et al., 1981) and the high-breakdown Median Ball (MB) estimator (Olive & Hawkins, 2008) as attractors. Mahalanobis (1936) defined the Mahalanobis Distance (MD) to measure the deviation of a data point from the center of the data. Let us write the vector of predictor variables for observation $i$ as:

$$x_i = (x_{i1}, x_{i2}, \dots, x_{ip}), \qquad i = 1, \dots, n,$$

where $x_i$ is a $p$-dimensional row vector. The mean vector and the variance-covariance matrix are calculated as:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad S = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^{\top}(x_i - \bar{x}).$$

Subsequently, the MD for each observation is written as Equation (11):

$$MD_i = \sqrt{(x_i - \bar{x})\, S^{-1} (x_i - \bar{x})^{\top}} \qquad (11)$$

where $\bar{x}$ is the mean vector and $S$ is the variance-covariance matrix.
The steps of the RFCH algorithm are given in Equations (12), (13), and (14), which are repeated with the new cut-off point until convergence to obtain the final attractors. Subsequently, the Mahalanobis Distance based on these final estimates is computed, and a new data set is constructed using Equation (15).
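The distance computation and the trimming idea can be illustrated with a short Python sketch. It uses the classical mean and covariance rather than the RFCH estimators, so it is not the algorithm of Olive and Hawkins (2010); the data, the helper names, and the cut-off of 2.0 are hypothetical.

```python
import math


def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows, center):
    """Sample variance-covariance matrix with divisor n - 1."""
    n, p = len(rows), len(center)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        dev = [x - m for x, m in zip(row, center)]
        for j in range(p):
            for k in range(p):
                cov[j][k] += dev[j] * dev[k] / (n - 1)
    return cov


def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    p = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(aug[i][c] * z[c] for c in range(i + 1, p))
        z[i] = (aug[i][p] - tail) / aug[i][i]
    return z


def mahalanobis_distances(rows, center, cov):
    distances = []
    for row in rows:
        dev = [x - m for x, m in zip(row, center)]
        z = solve(cov, dev)  # z = cov^{-1} dev
        distances.append(math.sqrt(sum(d * v for d, v in zip(dev, z))))
    return distances


# Hypothetical bivariate data with one gross outlier in the last row
data = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (2.5, 2.5), (10.0, -5.0)]
center = mean_vector(data)
cov = covariance_matrix(data, center)
distances = mahalanobis_distances(data, center, cov)

cutoff = 2.0  # hypothetical cut-off, chosen only for this illustration
clean = [row for row, dist in zip(data, distances) if dist <= cutoff]
```

In this example the outlier only barely exceeds the cut-off, because it also pulls the classical mean and inflates the classical covariance. That masking effect is the motivation for replacing the classical estimates with robust ones such as RFCH.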
The robust chi-square statistic, the model degrees of freedom, the scale factor, and the shift factor for WLSMV and ULSMV are denoted T, d, a, and b, respectively. The PR model-fit indexes for a sample of size n are given in Equations (16), (17), and (18) (Xia and Yang 2019; Savalei 2018; Asparouhov and Muthén 2010).
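For reference, the conventional definitions of RMSEA, CFI, and TLI in terms of a chi-square statistic can be sketched in Python. These are the textbook forms, not necessarily the PR versions of Equations (16) to (18); the use of n - 1 in RMSEA (some software uses n), the baseline-model inputs, and the numeric values are assumptions made for illustration.

```python
import math


def rmsea(t_stat, df, n):
    """Root mean square error of approximation (n - 1 convention)."""
    return math.sqrt(max(t_stat - df, 0.0) / (df * (n - 1)))


def cfi(t_stat, df, t_base, df_base):
    """Comparative fit index relative to a baseline (independence) model."""
    denominator = max(t_stat - df, t_base - df_base, 0.0)
    if denominator == 0.0:
        return 1.0
    return 1.0 - max(t_stat - df, 0.0) / denominator


def tli(t_stat, df, t_base, df_base):
    """Tucker-Lewis index relative to a baseline (independence) model."""
    base_ratio = t_base / df_base
    return (base_ratio - t_stat / df) / (base_ratio - 1.0)


# Hypothetical statistics, not results from the article
rmsea_value = rmsea(t_stat=30.0, df=20, n=201)
cfi_value = cfi(t_stat=30.0, df=20, t_base=520.0, df_base=30)
tli_value = tli(t_stat=30.0, df=20, t_base=520.0, df_base=30)
```

Smaller RMSEA values and CFI and TLI values closer to one indicate better fit, which matches how these indexes are read in the results.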
Residual matrix. To examine the hypothesis that Σ = Σ(θ), the difference Σ − Σ(θ) is calculated. If the model is correct, this difference is a null matrix, so a nonzero element indicates a model specification error. In practice, the sample matrix S is substituted for Σ and the estimated model-implied matrix for Σ(θ), which gives the sample residual matrix S − Σ(θ); each of its elements is the difference between a sample covariance and the corresponding model-implied covariance. The sign of each element shows whether the model under-predicts (positive residual) or over-predicts (negative residual) the covariance between two observed variables. The correlation residuals are given in Equation (19) (Hildreth 2013; Ibrahim and Mohammed 2021).
This index is known as the Standardized Root Mean Square Residual (SRMR), introduced by Bentler in 1995. The sample estimate of SRMR and its population counterpart are calculated as in Equation (20), where s = k(k + 1)/2 is the number of unique elements and the summed terms are the elements of the sample covariance matrix and of the model-implied covariance matrix, respectively.
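A minimal Python sketch of the sample SRMR follows, assuming the common definition in which each residual is standardized by the sample standard deviations and averaged over the k(k + 1)/2 unique elements. The function name and the two small matrices are hypothetical.

```python
import math


def srmr(sample_cov, implied_cov):
    """Standardized root mean square residual over the unique matrix elements."""
    k = len(sample_cov)
    total = 0.0
    for i in range(k):
        for j in range(i + 1):  # lower triangle including the diagonal
            scale = math.sqrt(sample_cov[i][i] * sample_cov[j][j])
            total += ((sample_cov[i][j] - implied_cov[i][j]) / scale) ** 2
    return math.sqrt(total / (k * (k + 1) / 2))


# Hypothetical 2 x 2 sample and model-implied matrices
sample = [[1.0, 0.5], [0.5, 1.0]]
implied = [[1.0, 0.4], [0.4, 1.0]]
srmr_value = srmr(sample, implied)
```

Here only the off-diagonal element differs, by 0.1, so the index equals the square root of 0.01 / 3.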
The simulation was conducted to address the research objectives and problems. This section describes the simulation design, the data generation and analysis procedures, and the evaluation of the results. Continuous data were generated in R following the methods of Vale and Maurelli (1983) and Rhemtulla, Brosseau-Liard, and Savalei (2012) under two conditions: a multivariate normal distribution with skewness 0 and kurtosis 0, and a moderately non-normal distribution with skewness 2 and kurtosis 7. The number of variables follows the variance-covariance matrix defined by the model. A set of thresholds was then used to convert each continuous variable into an ordered categorical variable with five categories, which is common in research. Data were generated for different sample sizes, with 500 replications for each condition and an average of 20% contamination introduced at random for each sample size, and the proposed modified robust procedure was applied to clean the data of outliers. The following table shows the design of the simulation experiment in terms of the model, sample sizes, and distributions.
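The step that converts a continuous variable into five ordered categories can be sketched in Python. The article's data were generated in R; this sketch covers only the normal case, and the threshold values, the seed, and the function name are hypothetical. The Vale and Maurelli non-normal generation and the contamination step are not shown.

```python
import random
from bisect import bisect_left


def discretize(values, thresholds):
    """Map each value to category c when thresholds[c-2] < value <= thresholds[c-1]."""
    return [bisect_left(thresholds, value) + 1 for value in values]


# Hypothetical thresholds giving five ordered categories
thresholds = [-1.5, -0.5, 0.5, 1.5]

rng = random.Random(2022)  # fixed seed so the example is reproducible
continuous = [rng.gauss(0.0, 1.0) for _ in range(200)]
ordinal = discretize(continuous, thresholds)
```

Four thresholds produce five categories, coded 1 to 5, in the same way that a five-point Likert item is coded.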
The first model consists of four factors and 12 observed variables, with three indicators per factor. There are three exogenous factors and one endogenous factor, and the indicators of the first three factors have loadings of 0.70. The loadings of the indicators of the remaining factor were generated randomly from a normal distribution with a mean of 0.5 and a standard deviation of 0.05. The following diagram describes the design of the simulation experiment for the model.
The simulation model was designed as follows:
where the loading matrices contain the factor loadings for X and Z, respectively.
The correlation matrix of the exogenous latent variables has correlations equal to 0.2, as shown in the matrix below.
The measurement-error covariance matrix contains the residual variances of the independent and dependent indicators, each equal to 1. In contrast, the factor covariance matrix reflects the variances and correlations of the latent variables.
The path matrix represents the paths between the exogenous and endogenous latent variables; these paths were generated from a multivariate normal distribution with a mean of 0.3 and a standard deviation of 0.5.
The model consists of two parts. The measurement model is represented by the mathematical equations in (21), and the structural model is written in the form of Equation (22). The parameters ... are unknown and must be estimated: the factor loadings of the measurement model, the measurement errors of the observed variables, and the structural model parameters, which represent a path analysis among the latent variables.
To assess the overall quality of the standard errors of the parameters, the total average absolute bias of the standard errors was calculated, as shown in Table (1). It covers the factor loadings, structural coefficients, and correlations for the two estimation methods, both in the presence of outliers and after applying the proposed RFCH method, under the normal and the moderately non-normal distributions. The bias of the standard errors decreased for all sample sizes and all methods, which indicates both the quality of the proposed method in cleaning the data of outliers and the effect of outliers on standard errors.
The prefix C in front of a method name indicates that the method was applied to the data cleaned by the proposed RFCH method, giving CULSMV and CWLSMV. These methods treat the data as ordinal by calculating the polychoric matrix and apply the robust corrections to the standard errors and to the chi-square. The average absolute bias for ULSMV before cleaning ranged between 0.4462 and 0.4227, while for CULSMV, after applying the proposed method, it ranged between 0.16983 and 0.12016. This result shows a clear difference from using the RFCH method: the errors were very small and lower than those obtained by applying the WLSMV method directly to the contaminated data. Comparing the two methods shows that both perform well in terms of the bias of the standard errors for the clean data and give close results. In some sample sizes the WLSMV method is superior, and in others the ULSMV method is superior.
The overall quality of the estimated parameters was assessed by calculating the average absolute bias of the parameters before and after cleaning the outliers with the proposed method. Table (2) shows that, for all methods, the bias of the parameters estimated after applying the robust RFCH was very small compared with the contaminated data.
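One common way to summarize parameter bias across replications is the average absolute relative bias, sketched below in Python. The article does not give its exact formula in this passage, so the definition (absolute difference between the mean estimate and the true value, divided by the true value, averaged over parameters), the function name, and the numbers are assumptions.

```python
def average_absolute_relative_bias(estimates, true_values):
    """Average over parameters of |mean estimate - true value| / |true value|.

    estimates: one list of parameter estimates per replication.
    true_values: the population values used to generate the data.
    """
    n_reps = len(estimates)
    biases = []
    for j, true_value in enumerate(true_values):
        mean_estimate = sum(rep[j] for rep in estimates) / n_reps
        biases.append(abs((mean_estimate - true_value) / true_value))
    return sum(biases) / len(biases)


# Hypothetical true values and two replications of estimates
true_values = [0.7, 0.3]
estimates = [[0.72, 0.28], [0.68, 0.35]]
bias = average_absolute_relative_bias(estimates, true_values)
```

In this toy example the first parameter is unbiased on average and the second has a relative bias of 5%, so the average is 2.5%.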
For the data generated under normality with outliers and then cleaned by the proposed method, the overall average bias of the parameters was much smaller than for the data generated under the non-normal distribution, and the robust weighted methods performed better without outliers under both distributions. In addition to evaluating the model through the average bias of the standard errors and the estimated parameters, the quality of the proposed method after cleaning is evaluated through the residual matrix, which represents the difference between the real parameter and the estimated parameter.
In the simulation results for the model above, the data follow two distributions, the first with skewness 2 and kurtosis 7 and the second with skewness 0 and kurtosis 0, in the presence of outliers. The data were cleaned of outliers by the proposed RFCH method, and five sample sizes were used: 200, 400, 600, 800, and 1000. The fit indexes differ according to the estimation method, because some methods use the robust chi-square correction and some fit indexes are based on the robust-corrected chi-square.
By comparison, the chi-square value was lower when the data were non-normally distributed, which indicates the robustness of the correction in dealing with non-normal data. All chi-square values decreased after using the proposed RFCH method, as shown in Table (3).
For the ULSMV and WLSMV methods, the robust mean and variance correction of Asparouhov and Muthén (2010) gave better results, especially when the distributional assumption is violated and outliers are present. The chi-square index is biased by the sample size and the model size and is affected by the degree of distribution, so other fit indexes have been developed based on the chi-square and on the robust-corrected chi-square. Even so, cleaning the outliers brought the chi-square values of all methods close together.
RMSEA is the fit index that depends most on the estimation technique. Table (4) shows that, with the proposed method and for all sample sizes, the RMSEA values decreased and fell within the ideal limits close to zero. As the sample size increased, the RMSEA value approached zero under the RFCH method, and the use of the robust chi-square corrections in the RMSEA index gave better results.
For the ULSMV and WLSMV estimation methods, the values of this fit index are very small before and after cleaning, and they are smaller than those of the other methods for all sample sizes. When the data do not follow a normal distribution, the ULSMV method is superior, giving a relatively lower value than the WLSMV method. When the data follow a normal distribution, the table shows that the ULSMV values for the clean data ranged between 0.004306 and 0.00996, while the WLSMV values ranged between 0.009658 and 0.004389.
The SRMR fit index is less affected by the determinants of the chi-square. It is an index of the covariance matrix of the residuals: the closer it is to zero, the smaller the error, and the recommended upper limit is 0.08.
The results in Table (5) show that, for all methods and under both the normal and non-normal distributions, the SRMR value falls within the ideal limits. Some methods, such as WLSMV and ULSMV, also fall within the acceptable limits before cleaning because they use robust corrections for the errors. The table shows that these methods have the lowest SRMR compared with the other methods when the data are treated as ordinal. For the clean data under the non-normal distribution, the ULSMV values ranged between 0.04853 and 0.021946, which indicates a very good fit for the residuals, which represent the difference between the sample matrix of the real data and the matrix estimated from the model, while the WLSMV values ranged between 0.04853 and 0.021973. As the sample size increases, SRMR moves closer to zero for all methods after cleaning, giving the smallest residual error.
For the CFI and TLI fit indexes, a high value indicates a good fit. The results in Tables (6) and (7), for all methods, all sample sizes, and both distributions, show that the values of the two indexes lead to rejection of the model for most methods when the data contain outliers. After the proposed RFCH method was applied, the values indicated ideal fit quality and were close to one. Most methods give very close results after cleaning, especially when the data are normally distributed. We conclude that the accuracy and robustness of the fit indexes increase with the sample size, as shown in the tables.
Table (7): TLI fit index values for the small model
In addition, the TLI and CFI values are higher under the normal distribution than under the nonnormal distribution, both for the contaminated data and for the data cleaned with the proposed method.
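To make the behaviour of these two incremental indexes concrete, here is a minimal sketch of the textbook CFI and TLI formulas, which compare the chi-square of the fitted model with that of the baseline (independence) model. The function names and the chi-square values are hypothetical. With mean-and-variance corrected statistics such as those behind ULSMV and WLSMV, software may compute the indexes from scaled statistics, so this is an assumption-laden illustration rather than the computation used in the study.

```python
def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index from model and baseline chi-square values."""
    d_model = max(chi2_model - df_model, 0.0)
    # Denominator: max(chi2_model - df_model, chi2_base - df_base, 0).
    d_base = max(chi2_base - df_base, d_model)
    if d_base == 0.0:
        return 1.0  # neither model shows any misfit
    return 1.0 - d_model / d_base

def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index; it is not bounded and can exceed 1."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)

# Hypothetical values (made up for illustration only).
cfi_value = cfi(30.0, 24, 600.0, 36)
tli_value = tli(30.0, 24, 600.0, 36)
```

Both values move toward one as the model chi-square approaches its degrees of freedom, which is why values close to one are read as a good fit.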
We conclude from the simulation results that all methods, including those with robust corrections to the standard errors, are affected by outliers. With the proposed RFCH method, the absolute bias rate of the standard errors and of the parameters decreases significantly for all models, which indicates that the algorithm is effective at removing outliers, improving the parameter estimates, and reducing errors. The absolute bias rate of the parameters and standard errors is also affected by the distribution: the estimates are less accurate when the data are not normally distributed. Comparing the methods on the clean data after applying the proposed method, the best methods are ULSMV and WLSMV when the data are treated as ordinal with the polychoric matrix as input, because together with the robust corrections to the standard errors they give the lowest bias rate in the standard errors and the lowest bias in the parameter estimates. Across the simulated sample sizes, at a contamination rate of 20% the absolute bias rate of the errors increases with the sample size because of the contamination, whereas after cleaning with RFCH the standard errors at the same sample size are stable, which indicates the quality of the method. Regarding overall model quality based on the fit indexes, all fit indexes improve after using the proposed method and fall within the ideal cut-off limits after cleaning. We also conclude that the chi-square value is biased by the sample size: it rises as the sample size increases and is affected by the distribution, so relying on it is not recommended.
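The absolute bias rate discussed above can be illustrated with a short sketch. It assumes one common simulation definition, the absolute difference between the mean estimate over replications and the true value, divided by the absolute true value. The paper's exact formula is not shown in this section, so the definition, the function name, and the numbers below are assumptions for illustration.

```python
import numpy as np

def absolute_relative_bias(estimates, true_value):
    """|mean(estimates) - true_value| / |true_value| over replications."""
    estimates = np.asarray(estimates, dtype=float)
    return float(abs(estimates.mean() - true_value) / abs(true_value))

# Hypothetical factor-loading estimates from four replications,
# with an assumed population loading of 0.70.
contaminated = [0.58, 0.55, 0.61, 0.54]  # made-up "before cleaning" values
cleaned = [0.69, 0.71, 0.70, 0.68]       # made-up "after cleaning" values

bias_before = absolute_relative_bias(contaminated, 0.70)
bias_after = absolute_relative_bias(cleaned, 0.70)
```

The same function applies to standard errors if the true value is replaced by a reference value such as the empirical standard deviation of the estimates across replications.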
The simulation results show that all the fit indexes are affected by the sample size: after cleaning with the proposed method, the accuracy of the fit indexes increases as the sample size increases. TLI and CFI are close to one at the larger sample sizes, so modeling requires a large sample size. The quality of the fit indexes is also affected by the distribution. When the data are normally distributed and free of outliers, the fit indexes are closer to ideal than under a nonnormal distribution. Plotting the residual matrix for all methods shows that, after cleaning with the proposed method, the residuals approach zero and a normal distribution. Using the robust corrections of Asparouhov and Muthén (2010) with the ULS and DWLS estimation methods gave better results and better fit when the polychoric correlation matrix was used, especially for nonnormally distributed data, because this correction is robust for data that are not normally distributed.

References
Asparouhov, Tihomir, and Bengt Muthén. 2010. “Simple Second Order Chi-Square Correction.” 1–8.
Bollen, Kenneth A. 1989. Structural Equations with Latent Variables. John Wiley & Sons.
Byrne, Barbara M. 2013. Structural Equation Modeling with Mplus: Basic Concepts, Applications, and Programming. New York and London: Routledge.
DiStefano, Christine. 2002. “The Impact of Categorization With Confirmatory Factor Analysis.” Structural Equation Modeling 9(3):327–46.
Flora, David B., and Patrick J. Curran. 2004. “An Empirical Evaluation of Alternative Methods of Estimation for Confirmatory Factor Analysis with Ordinal Data.” Psychological Methods 9(4):466–91.
Hancock, Gregory R., and Ralph O. Mueller. 2013. Structural Equation Modeling: A Second Course. 2nd ed. United States of America: IAP.
Hildreth, Laura. 2013. “Residual Analysis for Structural Equation Modeling.” Statistics PhD.
Ibrahim, Omar Salim, and Mohammed Jasim Mohammed. 2021. “A Proposed Method for Cleaning Data from Outlier Values Using the Robust Rfch Method in Structural Equation Modeling.” International Journal of Nonlinear Analysis and Applications 12(2):2269–93.
Jia, Fan. 2016. “Methods for Handling Missing Non-Normal Data in Structural Equation Modeling.” Doctoral Dissertation, University of Kansas 111.
Kline, Rex B. 2016. Principles and Practice of Structural Equation Modeling. 4th ed. The Guilford Press.
Muthén, Bengt O. 2002. Mplus Technical Appendices. Retrieved from http://www.statmodel.com/download/techappen.pdf.
Muthén, Bengt O., and Tihomir Asparouhov. 2002. “Latent Variable Analysis with Categorical Outcomes: Multiple-Group and Growth Modeling in Mplus.” Mplus Web Notes 5(4):1–22.
Nalbantoğlu Yılmaz, Funda. 2019. “Comparison of Different Estimation Methods Used in Confirmatory Factor Analyses in Non-Normal Data: A Monte Carlo Study.” International Online Journal of Educational Sciences 11(4):131–40.
Olive, David J. 2017. Robust Multivariate Analysis. USA: Springer.
Olive, David J., and Douglas M. Hawkins. 2008. “High Breakdown Multivariate Estimators.” 1–29.
Olive, David J., and Douglas M. Hawkins. 2010. “Robust Multivariate Location and Dispersion.” Unpublished manuscript available from http://www.math.siu.edu/olive/pphbmld.pdf. 1–30.
Olsson, Ulf. 1979. “Maximum Likelihood Estimation of the Polychoric Correlation Coefficient.” Psychometrika 44(4):443–60.
Rhemtulla, Mijke, Patricia É. Brosseau-Liard, and Victoria Savalei. 2012. “When Can Categorical Variables Be Treated as Continuous? A Comparison of Robust Continuous and Categorical SEM Estimation Methods under Suboptimal Conditions.” Psychological Methods 17(3):354–73.
Rousseeuw, Peter J., and Katrien Van Driessen. 1999. “A Fast Algorithm for the Minimum Covariance Determinant Estimator.” Technometrics 41(3):212–23.
Savalei, Victoria. 2018. “On the Computation of the RMSEA and CFI from the Mean-And-Variance Corrected Test Statistic with Nonnormal Data in SEM.” Multivariate Behavioral Research 53(3):419–29.
Savalei, Victoria, and Mijke Rhemtulla. 2013. “The Performance of Robust Test Statistics with Categorical Data.” British Journal of Mathematical and Statistical Psychology 66(2):201–23.
Schermelleh-Engel, Karin, Helfried Moosbrugger, and Hans Müller. 2003. “Evaluating the Fit of Structural Equation Models: Tests of Significance and Descriptive Goodness-of-Fit Measures.” Methods of Psychological Research Online 8(2):23–74.
Timm, Neil H. 2002. Applied Multivariate Analysis. New York: Springer-Verlag.
Uraibi, Hassan S., and Habshah Midi. 2019. “On Robust Bivariate and Multivariate Correlation Coefficient.” Economic Computation and Economic Cybernetics Studies and Research 53(2):221–39.
Vale, C. David, and Vincent A. Maurelli. 1983. “Simulating Multivariate Nonnormal Distributions.” Psychometrika 48(3):465–71.
Xia, Yan, and Yanyun Yang. 2018. “The Influence of Number of Categories and Threshold Values on Fit Indices in Structural Equation Modeling with Ordered Categorical Data.” Multivariate Behavioral Research 53(5):731–55.
Xia, Yan, and Yanyun Yang. 2019. “RMSEA, CFI, and TLI in Structural Equation Modeling with Ordered Categorical Data: The Story They Tell Depends on the Estimation Methods.” Behavior Research Methods 51(1):409–28.
Yang-Wallentin, Fan, Karl G. Jöreskog, and Hao Luo. 2010. “Confirmatory Factor Analysis of Ordinal Variables with Misspecified Models.” Structural Equation Modeling 17(3):392–423.