
A Modified Approach for Missing Values in Data Mining Based on Rough Set Theory, Divided –and-Conquer, Closest Fit Approach Idea

Abhijit Sarkar, Kasturi Ghosh

Section: Research Paper, Product Type: Conference Paper
Volume-03, Issue-01, Page no. 51-58, Feb-2015

Online published on Feb 18, 2015

Copyright © Abhijit Sarkar, Kasturi Ghosh. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper

IEEE Style Citation: Abhijit Sarkar, Kasturi Ghosh, “A Modified Approach for Missing Values in Data Mining Based on Rough Set Theory, Divided –and-Conquer, Closest Fit Approach Idea,” International Journal of Computer Sciences and Engineering, Vol.03, Issue.01, pp.51-58, 2015.

MLA Style Citation: Abhijit Sarkar, Kasturi Ghosh. "A Modified Approach for Missing Values in Data Mining Based on Rough Set Theory, Divided –and-Conquer, Closest Fit Approach Idea." International Journal of Computer Sciences and Engineering 03.01 (2015): 51-58.

APA Style Citation: Abhijit Sarkar, Kasturi Ghosh (2015). A Modified Approach for Missing Values in Data Mining Based on Rough Set Theory, Divided –and-Conquer, Closest Fit Approach Idea. International Journal of Computer Sciences and Engineering, 03(01), 51-58.

BibTex Style Citation:
@article{Sarkar_2015,
author = {Abhijit Sarkar and Kasturi Ghosh},
title = {A Modified Approach for Missing Values in Data Mining Based on Rough Set Theory, Divided –and-Conquer, Closest Fit Approach Idea},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {2 2015},
volume = {03},
number = {01},
month = {2},
year = {2015},
issn = {2347-2693},
pages = {51-58},
url = {https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=8},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=8
TI - A Modified Approach for Missing Values in Data Mining Based on Rough Set Theory, Divided –and-Conquer, Closest Fit Approach Idea
T2 - International Journal of Computer Sciences and Engineering
AU - Sarkar, Abhijit
AU - Ghosh, Kasturi
PY - 2015
DA - 2015/02/18
PB - IJCSE, Indore, INDIA
SP - 51
EP - 58
IS - 01
VL - 03
SN - 2347-2693
ER -

           

Abstract

Missing data is a common problem in practical applications, and filling these gaps is a main objective of the data preprocessing step in data mining. Statistical and prediction-based methods are generally used for missing data analysis, but both have disadvantages, particularly when values are missing serially within a column. This paper addresses the shortcomings of these two methods by merging them into a single algorithm. The modified approach uses the latent knowledge and regularities implied by the data in the information system, together with some basic mathematical concepts and some concepts from Rough Set Theory. Experimental results show that the proposed algorithm gives better results than either of the two methods on its own.
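To make the contrast between a statistical fill and a similarity-based fill concrete, the following is a minimal sketch in Python. It is not the algorithm proposed in the paper: it only illustrates a column-mean fill next to a simple closest-fit fill on a column with serially missing values. The function names (`mean_fill`, `closest_fit_fill`, `distance`), the `MISSING` marker, and the toy data are all assumptions made for this illustration.

```python
from statistics import mean

MISSING = None  # marker for a missing cell (an assumption of this sketch)


def mean_fill(rows, col):
    """Statistical baseline: replace missing cells in `col` with the column mean."""
    known = [r[col] for r in rows if r[col] is not MISSING]
    fill = mean(known)
    filled = []
    for r in rows:
        r = list(r)  # copy so the input is left untouched
        if r[col] is MISSING:
            r[col] = fill
        filled.append(r)
    return filled


def distance(a, b, skip, ranges):
    """Range-normalised Manhattan distance over all attributes except `skip`.

    A missing value on either side contributes the maximal distance 1.
    """
    total = 0.0
    for j, (x, y) in enumerate(zip(a, b)):
        if j == skip:
            continue
        if x is MISSING or y is MISSING:
            total += 1.0
        elif ranges[j] > 0:
            total += abs(x - y) / ranges[j]
    return total


def closest_fit_fill(rows, col):
    """Similarity-based fill: copy `col` from the most similar complete row."""
    ranges = []
    for j in range(len(rows[0])):
        known = [r[j] for r in rows if r[j] is not MISSING]
        ranges.append(max(known) - min(known) if known else 0.0)
    donors = [r for r in rows if r[col] is not MISSING]
    filled = []
    for r in rows:
        r = list(r)
        if r[col] is MISSING:
            best = min(donors, key=lambda d: distance(r, d, col, ranges))
            r[col] = best[col]
        filled.append(r)
    return filled


# Toy table: the second attribute is missing in two consecutive rows.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, None], [4.0, None], [5.0, 50.0]]
by_mean = mean_fill(rows, 1)
by_fit = closest_fit_fill(rows, 1)
print([r[1] for r in by_mean])
print([r[1] for r in by_fit])
```

In this toy table the mean fill assigns the same value to both missing cells, while the closest-fit fill copies a value from whichever complete row is nearest on the remaining attribute, so consecutive gaps can receive different values.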

Keywords / Index Terms

Data mining; Missing data; Data preprocessing; Statistical methods; Prediction methods; Rough Set Theory; Serially
