Open Access Article

A Speculative Study on Hadoop Scheduling Algorithms

Vanika, Aman Kumar Sharma

Section: Review Paper, Product Type: Journal Paper
Volume-6, Issue-6, Page no. 1171-1176, Jun-2018

CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i6.11711176

Online published on Jun 30, 2018

Copyright © Vanika, Aman Kumar Sharma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Vanika, Aman Kumar Sharma, “A Speculative Study on Hadoop Scheduling Algorithms,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.6, pp.1171-1176, 2018.

MLA Style Citation: Vanika, Aman Kumar Sharma. "A Speculative Study on Hadoop Scheduling Algorithms." International Journal of Computer Sciences and Engineering 6.6 (2018): 1171-1176.

APA Style Citation: Vanika, Aman Kumar Sharma (2018). A Speculative Study on Hadoop Scheduling Algorithms. International Journal of Computer Sciences and Engineering, 6(6), 1171-1176.

BibTex Style Citation:
@article{Sharma_2018,
author = {Vanika and Aman Kumar Sharma},
title = {A Speculative Study on Hadoop Scheduling Algorithms},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {6 2018},
volume = {6},
number = {6},
month = {6},
year = {2018},
issn = {2347-2693},
pages = {1171--1176},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2321},
doi = {10.26438/ijcse/v6i6.11711176},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
AU  - Vanika
AU  - Sharma, Aman Kumar
TI  - A Speculative Study on Hadoop Scheduling Algorithms
T2  - International Journal of Computer Sciences and Engineering
PY  - 2018
DA  - 2018/06/30
VL  - 6
IS  - 6
SP  - 1171
EP  - 1176
SN  - 2347-2693
PB  - IJCSE, Indore, INDIA
DO  - 10.26438/ijcse/v6i6.11711176
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=2321
ER  - 

Abstract

Big Data refers to the techniques used to capture, process, analyze and visualize large datasets within a reasonable time span. The platforms, tools and software used for this purpose are known as “Big Data technologies”. Hadoop is an open-source framework that processes large amounts of data inexpensively and efficiently using MapReduce, a programming model for processing and generating large data sets with a parallel, distributed algorithm on a cluster. Job scheduling is a key factor in achieving high performance in big data processing. This paper presents a comparative study of job scheduling algorithms in the Hadoop environment and describes the features, advantages and disadvantages of the FIFO, Fair, Capacity, LATE, Energy-aware, Resource-aware, Matchmaking, Delay and Deadline Constraints schedulers.
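
To illustrate why the choice of scheduler matters, the following is a minimal sketch contrasting a FIFO policy with a fair-share policy. It is not taken from the paper and does not reproduce Hadoop's actual scheduler code: the function names `fifo_schedule` and `fair_schedule`, the unit-length tasks, and the fixed pool of slots are all simplifying assumptions made for illustration.

```python
def fifo_schedule(jobs, slots):
    """Toy FIFO policy (hypothetical helper, not a Hadoop API).

    jobs:  list of (name, task_count) in submission order.
    slots: number of tasks the cluster can run per time step.
    Every task is assumed to take exactly one time step.
    Returns {job_name: step at which its last task finishes}.
    """
    remaining = dict(jobs)  # dicts preserve insertion (submission) order
    finish = {}
    step = 0
    while remaining:
        step += 1
        free = slots
        # Earlier jobs take as many slots as they can use first.
        for name in list(remaining):
            if free == 0:
                break
            run = min(free, remaining[name])
            remaining[name] -= run
            free -= run
            if remaining[name] == 0:
                finish[name] = step
                del remaining[name]
    return finish


def fair_schedule(jobs, slots):
    """Toy fair-share policy: slots are handed out one at a time,
    cycling over all jobs that still have tasks waiting.
    Same inputs, assumptions and return value as fifo_schedule.
    """
    remaining = dict(jobs)
    finish = {}
    step = 0
    while remaining:
        step += 1
        free = slots
        # Round-robin over jobs until slots or waiting tasks run out.
        while free > 0 and any(v > 0 for v in remaining.values()):
            for name in remaining:
                if free == 0:
                    break
                if remaining[name] > 0:
                    remaining[name] -= 1
                    free -= 1
        for name in list(remaining):
            if remaining[name] == 0:
                finish[name] = step
                del remaining[name]
    return finish
```

Under these assumptions, with two slots and the jobs `[("big", 8), ("small", 2)]`, the FIFO sketch finishes `big` at step 4 and `small` at step 5, while the fair-share sketch finishes `small` at step 2 and `big` at step 5. The short job is no longer stuck behind the long one, at the cost of a slightly later finish for the long job.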

Keywords / Index Terms

Big Data, Hadoop, MapReduce, Distributed Systems, Hadoop Scheduling Algorithms
