An Approach for Solving Missing Values in Data Set Using Clustering-Curve Fitting Technique
Journal of Kufa for Mathematics and Computer
Article 1, Volume 2, Issue 2, August 2015, Pages 81-99
Authors
Kadhim AlJanabi; Mansoor Habeebi; Nawras Riyadh Neamah
Abstract
Missing values in data sets represent one of the greatest challenges in analyzing data to extract knowledge. This paper presents a new approach to the missing-value problem that merges two different techniques: clustering (K-means and Expectation Maximization) and curve fitting. More than twenty thousand records of a real health data set collected from different Iraqi hospitals were used to create and test the proposed approach, which showed better results than the most popular techniques for estimating missing values, such as most common value, overall average, class average, and class most common value. Several software tools were used in this work, including WEKA (Waikato Environment for Knowledge Analysis), Matlab, Excel, and C++.
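For readers unfamiliar with the baseline estimators named in the abstract, the following is a minimal Python sketch of how overall average, most common value, and their per-class variants are commonly computed. It is not the authors' implementation and does not cover the proposed clustering and curve-fitting approach. The function names, the use of `None` to mark a missing entry, and the toy data in the usage note are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean


def overall_average(values):
    """Mean of the observed entries (None marks a missing value; an assumption)."""
    return mean([v for v in values if v is not None])


def most_common(values):
    """Most frequent observed entry; on ties, the first one encountered wins."""
    observed = [v for v in values if v is not None]
    return Counter(observed).most_common(1)[0][0]


def impute_overall(values, estimator):
    """Replace every missing entry with one estimate taken over the whole column."""
    fill = estimator(values)
    return [fill if v is None else v for v in values]


def impute_by_class(values, classes, estimator):
    """Replace each missing entry with an estimate taken over its own class only.

    Note: a class whose entries are all missing makes the estimator fail;
    handling that case is left out of this sketch.
    """
    by_class = defaultdict(list)
    for v, c in zip(values, classes):
        by_class[c].append(v)
    fill = {c: estimator(vs) for c, vs in by_class.items()}
    return [fill[c] if v is None else v for v, c in zip(values, classes)]
```

As a usage illustration on hypothetical data, with `values = [120, None, 130, 90, None, 100]` and `classes = ["a", "a", "a", "b", "b", "b"]`, `impute_overall(values, overall_average)` fills both gaps with 110, while `impute_by_class(values, classes, overall_average)` fills them with 125 and 95 respectively. Passing `most_common` instead of `overall_average` gives the two most-common-value variants for categorical attributes.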
Keywords
Data Mining; Missing Values; Clustering; Curve Fitting