Open Access Article

An Effective K-means approach for Imbalance data clustering using Precise Reduction Sampling

Shaik.Nagul 1, R.Kiran Kumar 2

1 Department of Computer Science, Krishna University, Machilipatnam, India.
2 Department of Computer Science, Krishna University, Machilipatnam, India.

Correspondence should be addressed to: nagulcse@gmail.com.

Section: Research Paper, Product Type: Journal Paper
Volume 6, Issue 3, Page no. 65-70, Mar-2018

CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i3.6570

Online published on Mar 30, 2018

Copyright © Shaik.Nagul, R.Kiran Kumar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Citation

IEEE Style Citation: Shaik.Nagul, R.Kiran Kumar, “An Effective K-means approach for Imbalance data clustering using Precise Reduction Sampling”, International Journal of Computer Sciences and Engineering, Vol.6, Issue.3, pp.65-70, 2018.

MLA Style Citation: Shaik.Nagul, R.Kiran Kumar "An Effective K-means approach for Imbalance data clustering using Precise Reduction Sampling." International Journal of Computer Sciences and Engineering 6.3 (2018): 65-70.

APA Style Citation: Shaik.Nagul, R.Kiran Kumar, (2018). An Effective K-means approach for Imbalance data clustering using Precise Reduction Sampling. International Journal of Computer Sciences and Engineering, 6(3), 65-70.


Abstract

K-means clustering is one of the top 10 algorithms in the field of data mining and knowledge discovery. The uniform effect of k-means clustering, its tendency to produce clusters of similar size, means that imbalanced source data hampers its performance in knowledge discovery. In this paper, we propose a novel clustering algorithm, Precise Reduction Sampling K-means (PRS_K-means), for handling imbalanced data efficiently and reducing the uniform effect. The experiments show that the algorithm not only attends to the different instances of sub-clusters to identify the intrinsic properties of the instances for clustering, but also performs better than K-means, with a lower error rate and higher accuracy and recall.
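The uniform effect mentioned in the abstract can be illustrated with a minimal sketch. It is not the PRS_K-means algorithm, whose steps are not given on this page. It runs plain k-means on a synthetic, hypothetical 1-D data set of 1000 majority points and 20 minority points. The function name `kmeans_1d`, the data, and the initial centroids are all assumptions made for illustration.

```python
def kmeans_1d(points, init_centroids, max_iter=100):
    """Plain Lloyd-style k-means on 1-D data (illustrative helper, not from the paper)."""
    centroids = list(init_centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: abs(x - centroids[j]))
            for x in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, labels


# Synthetic imbalanced data (assumed): a wide majority class and a small,
# well-separated minority class.
majority = [i / 100 for i in range(1000)]    # 1000 points spread over [0, 10)
minority = [12 + i / 100 for i in range(20)]  # 20 points spread over [12, 12.2)
points = majority + minority
truth = [0] * len(majority) + [1] * len(minority)

# Cluster 0 starts at the smallest value, cluster 1 at the largest.
centroids, labels = kmeans_1d(points, [min(points), max(points)])

sizes = [labels.count(0), labels.count(1)]
accuracy = sum(1 for t, lab in zip(truth, labels) if t == lab) / len(points)
error_rate = 1 - accuracy
minority_recall = (
    sum(1 for t, lab in zip(truth, labels) if t == 1 and lab == 1) / len(minority)
)

print("true class sizes:", [len(majority), len(minority)])
print("k-means cluster sizes:", sizes)
print("accuracy:", round(accuracy, 3), "error rate:", round(error_rate, 3))
print("minority recall:", minority_recall)
```

On this toy data the two clusters come out at comparable sizes even though the true classes are split 1000 to 20, because k-means divides the large class rather than isolating the small one. Every minority point lands in the second cluster, so minority recall stays high while accuracy and error rate suffer. This is the kind of behaviour that sampling-based reduction of the majority class aims to counter.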

Keywords / Index Terms

Data Mining, Knowledge Discovery, Clustering, K-means, imbalance data, uniform effect, under sampling, PRS_K-means
