International Journal of
Computer Sciences and Engineering

Scholarly Peer-Reviewed Scientific Research Publishing Journal
A study on performance of UCI Hungarian dataset using missing value management techniques
R. MISIR1* , R.K. SAMANTA2

Section: Review Paper, Product Type: Journal Paper
Volume-5, Issue-3, Page no. 40-44, Mar-2017
Online published on Mar 25, 2017

Copyright © R. MISIR, R.K. SAMANTA . This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
 
Citation

IEEE Style Citation: R. MISIR, R.K. SAMANTA, “A study on performance of UCI Hungarian dataset using missing value management techniques”, International Journal of Computer Sciences and Engineering, Vol.5(3), pp.40-44, 2017.

MLA Style Citation: R. MISIR, R.K. SAMANTA "A study on performance of UCI Hungarian dataset using missing value management techniques." International Journal of Computer Sciences and Engineering 5.3 (2017): 40-44.

APA Style Citation: R. MISIR, R.K. SAMANTA, (2017). A study on performance of UCI Hungarian dataset using missing value management techniques. International Journal of Computer Sciences and Engineering, 5(3), 40-44.
Abstract :
This paper presents a study on the performance of the UCI Hungarian dataset under several missing value management techniques. We applied a bootstrap algorithm with multiple imputation (MI), last observation carried forward (LOCF), mean–mode substitution, and IV-for missingness to the reduct file of the dataset, so that all 294 instances could be used as experimental input. With the MI technique, five imputed files were generated from the original reduct file and their results were averaged; for each of the other techniques, input files were created as required. The resulting files were studied using two well-recognized but contrasting classification approaches, IBPLN and BBP. Many such learning algorithms exist in the literature [10]; the most well known among them are back propagation [11], [12], ART [13], and RBF networks [14]. Test-case accuracy for the five imputed files varies from 89.79% to 99.00% as measured by CCR, the most widely recognized benchmarking parameter for judging classification results and the performance of the dataset.
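The simpler techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that "IV-for missingness" means adding a 0/1 indicator variable for each missing cell and that CCR denotes the correct classification rate (percentage of correctly classified cases). The function names, the `MISSING` marker, and the toy column are hypothetical.

```python
from collections import Counter
from statistics import mean

MISSING = None  # hypothetical marker for a missing cell


def locf(column):
    """Last observation carried forward: fill a gap with the previous observed value."""
    filled, last = [], MISSING
    for value in column:
        if value is not MISSING:
            last = value
        filled.append(last)  # a leading gap stays missing: nothing to carry yet
    return filled


def mean_mode_substitute(column, numeric):
    """Fill gaps with the mean (numeric attribute) or the mode (categorical attribute)."""
    observed = [v for v in column if v is not MISSING]
    fill = mean(observed) if numeric else Counter(observed).most_common(1)[0][0]
    return [fill if v is MISSING else v for v in column]


def missing_indicator(column, placeholder=0):
    """Assumed reading of 'IV-for missingness': placeholder fill plus a 0/1 indicator column."""
    indicator = [1 if v is MISSING else 0 for v in column]
    filled = [placeholder if v is MISSING else v for v in column]
    return filled, indicator


def ccr(y_true, y_pred):
    """Assumed definition of CCR: percentage of cases classified correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


# Toy numeric column with two gaps (illustrative values, not from the dataset).
chol = [200, MISSING, 240, MISSING, 220]
print(locf(chol))
print(mean_mode_substitute(chol, numeric=True))
print(missing_indicator(chol))
print(ccr([1, 0, 1, 1], [1, 0, 0, 1]))
```

Multiple imputation is not covered by this sketch; unlike the single-value fills above, it produces several completed files (five in this study) whose results are then averaged.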
Key-Words / Index Term :
Hungarian data sets, CARN, Amelia View, R Statistical platform, Bootstrapping, Multiple imputation, LOCF, Mean–Mode substitution, IV-for missingness, online incremental back propagation, Batch back propagation, CCR.
References :
[1] Rajesh Misir, R. K. Samanta, "Prediction of Heart Disease and Performances of Data Sets", Proceedings of Third International Conference on Computing and Systems 2016, pp. 7-10, January 21st-22nd, ISBN:978-93-85777-13-4.
[2] Rajesh Misir, Malay Mitra and R. K. Samanta, "A Study on Benchmarking Parameters for Intelligent Systems", National Conference on Computational Technologies-2015, International Journal of Computer Science and Engineering, Volume-3, Special Issue-1, E-ISSN: 2347-2693.
[3] R. Das et al.: Effective diagnosis of heart disease through neural network ensembles. Expert Systems with Applications, 36(4), 7675-7680 (2009).
[4] N. Cheung: Machine learning techniques for medical diagnosis. School of Information Technology and Electrical Engineering. B.Sc. Thesis, University of Queensland (2001).
[5] K. Polat et al.: A new classification method to diagnosis heart disease: supervised artificial immune system (AIRS). Proc. of the Turkish Symposium on Artificial Intelligence and Networks (2005).
[6] Lecture Notes in Electrical Engineering 326, Springer India ( 2015).
[7] K. B. Nahato et al.: Knowledge mining from clinical datasets using rough sets and backpropagation neural network. Computational and Mathematical Methods in Medicine, Article ID 460189 (2015).
[8] W. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous activity, Bulletin of Mathematical Biophysics, vol. 5, 115-133 (1943).
[9] D. O. Hebb, The Organization of Behavior: A Neuropsychological Theory, New York, John Wiley (1949).
[10] A. Roy, Artificial Neural Networks - A Science in Trouble, SIGKDD Explorations, vol. 1, issue 2, 33-38 (2000).
[11] D. E. Rumelhart, J. L. McClelland (eds.), Parallel Distributed Processing: Explorations in Microstructures of Cognition, vol. 1: Foundations, MIT Press, Cambridge, MA, 318-362 (1986).
[12] D. E. Rumelhart, The Architecture of Mind: A Connectionist Approach, Chapter 8 in J. Haugeland (ed.), Mind Design II, 1997, MIT Press, 205-232 (1986).
[13] S. Grossberg, Nonlinear Neural Networks: Principles, Mechanisms, and Architectures, Neural Networks, vol. 1, 17-61 (1988).
[14] J. Moody and C. Darken, Fast Learning in Networks of Locally-Tuned Processing Units, Neural Computation, vol. 1, 281-294 (1989).
[15] L. Fu, H. Hsu, and J. C. Principe, Incremental Backpropagation Learning Networks, IEEE Trans. on Neural Networks, vol. 7, no. 3, 757-761 (1996).
[16] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning internal representation by error propagation, in Parallel Distributed Processing: Explorations in the Microstructures of Cognition, MA, MIT Press, vol. 1 (1986).
[17] M. S. Hung, M. Shankar, M. Y. Hu, Estimating Breast Cancer Risks Using Neural Networks, J. Operational Research Society, Vol. 52, 1-10 (2001).
[18] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Networks, Vol. 2, 359-366 (1989).
[19] D. Goa, On structures of supervised linear basis function feedforward three-layered neural networks, Chin. J. Comput., vol. 21, no. 1, 80-86 (1998).
[20] M. L. Huang, Y. H. Hung, and W. Y. Chen, Neural network classifier with entropy based feature selection on breast cancer diagnosis, J Med Syst, vol. 34, no. 5, 865-873 (2010).