Open Access Article

A Mapreduce Approach To Deal with Big Data Pre Processing And Classification Problems Based On Evolutionary Algorithms

M.S. Saranya, N. Jayaveeran

Section: Research Paper, Product Type: Journal Paper
Volume 6, Issue 8, pp. 725-730, Aug 2018

CrossRef DOI: https://doi.org/10.26438/ijcse/v6i8.725730

Published online on Aug 31, 2018

Copyright © M.S. Saranya, N. Jayaveeran. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

IEEE Style Citation: M.S. Saranya, N. Jayaveeran, “A Mapreduce Approach To Deal with Big Data Pre Processing And Classification Problems Based On Evolutionary Algorithms,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.8, pp.725-730, 2018.

MLA Style Citation: M.S. Saranya, N. Jayaveeran. "A Mapreduce Approach To Deal with Big Data Pre Processing And Classification Problems Based On Evolutionary Algorithms." International Journal of Computer Sciences and Engineering 6.8 (2018): 725-730.

APA Style Citation: M.S. Saranya, N. Jayaveeran (2018). A Mapreduce Approach To Deal with Big Data Pre Processing And Classification Problems Based On Evolutionary Algorithms. International Journal of Computer Sciences and Engineering, 6(8), 725-730.

BibTex Style Citation:
@article{Saranya_2018,
author = {M.S. Saranya and N. Jayaveeran},
title = {A Mapreduce Approach To Deal with Big Data Pre Processing And Classification Problems Based On Evolutionary Algorithms},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {8 2018},
volume = {6},
number = {8},
month = {8},
year = {2018},
issn = {2347-2693},
pages = {725-730},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2761},
doi = {10.26438/ijcse/v6i8.725730},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
DO - 10.26438/ijcse/v6i8.725730
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=2761
TI - A Mapreduce Approach To Deal with Big Data Pre Processing And Classification Problems Based On Evolutionary Algorithms
T2 - International Journal of Computer Sciences and Engineering
AU - Saranya, M.S.
AU - Jayaveeran, N.
PY - 2018
DA - 2018/08/31
PB - IJCSE, Indore, INDIA
SP - 725
EP - 730
IS - 8
VL - 6
SN - 2347-2693
ER -

Abstract

Big data is a term used to describe the recent exponential growth in data, which poses an immense challenge for traditional learning techniques. To deal with big data pre-processing and classification problems, a novel MapReduce-Neuro Ant Colony (MR-NAC) algorithm is proposed. The proposed algorithm uses the MapReduce framework to pre-process and classify large datasets, a task that is difficult without such a framework. Experiments were carried out on two different datasets, and the results obtained are discussed. The results are satisfactory and support the proposed algorithm for big data pre-processing and classification. AUC and execution time are the two metrics used to measure the performance of the proposed MR-NAC algorithm.
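
For readers unfamiliar with the two evaluation metrics and the map/reduce pattern named above, the following is a minimal single-machine sketch in Python. It is not the authors' MR-NAC algorithm and uses no Hadoop API; the function names (`auc`, `map_partition`, `reduce_stats`, `mapreduce_mean`, `timed`) and the choice of a feature mean as the pre-processing statistic are assumptions made purely for illustration.

```python
import time
from functools import reduce


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def map_partition(partition):
    """Map step (illustrative): summarise one data partition
    as a (count, sum) pair for a single numeric feature."""
    return (len(partition), sum(partition))


def reduce_stats(a, b):
    """Reduce step (illustrative): merge two partial (count, sum) pairs."""
    return (a[0] + b[0], a[1] + b[1])


def mapreduce_mean(partitions):
    """Feature mean over all partitions, a typical pre-processing
    statistic, obtained without gathering the raw data in one place."""
    count, total = reduce(reduce_stats, map(map_partition, partitions))
    return total / count


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    partitions = [[1, 2], [3], [4, 5, 6]]   # hypothetical data splits
    mean, seconds = timed(mapreduce_mean, partitions)
    print("mean:", mean, "execution time (s):", seconds)
    print("AUC:", auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In a real MapReduce deployment the map step would run on separate nodes over separate data blocks, and the execution time would include scheduling and data-transfer overhead that this sketch does not model.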

Keywords / Index Terms

Big Data, MapReduce, Neural Network, Ant Colony, Pre-processing, Classification, Execution Time
