Open Access Article

Efficient Indexing and Searching of Big Data in HDFs

D.Deepika 1 , K.Pugazhmathi 2

Section: Research Paper, Product Type: Journal Paper
Volume 4, Issue 4, Pages 237-243, Apr 2016

Online published on Apr 27, 2016

Copyright © D.Deepika, K.Pugazhmathi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper

IEEE Style Citation: D.Deepika, K.Pugazhmathi, “Efficient Indexing and Searching of Big Data in HDFs,” International Journal of Computer Sciences and Engineering, Vol.4, Issue.4, pp.237-243, 2016.

MLA Style Citation: D.Deepika, K.Pugazhmathi "Efficient Indexing and Searching of Big Data in HDFs." International Journal of Computer Sciences and Engineering 4.4 (2016): 237-243.

APA Style Citation: D.Deepika, K.Pugazhmathi, (2016). Efficient Indexing and Searching of Big Data in HDFs. International Journal of Computer Sciences and Engineering, 4(4), 237-243.

BibTex Style Citation:
@article{_2016,
author = {D. Deepika and K. Pugazhmathi},
title = {Efficient Indexing and Searching of Big Data in HDFs},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {4 2016},
volume = {4},
number = {4},
month = {4},
year = {2016},
issn = {2347-2693},
pages = {237--243},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=927},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=927
TI  - Efficient Indexing and Searching of Big Data in HDFs
T2  - International Journal of Computer Sciences and Engineering
AU  - D.Deepika
AU  - K.Pugazhmathi
PY  - 2016
DA  - 2016/04/27
PB  - IJCSE, Indore, INDIA
SP  - 237
EP  - 243
IS  - 4
VL  - 4
SN  - 2347-2693
ER  - 

Abstract

An inverted index is an efficient, standard data structure that is well suited to search operations over a very large set of data. Such data is mostly unstructured and does not fit into traditional database categories. Processing it at large scale requires a distributed framework such as Hadoop, where computational resources can easily be shared and accessed. This paper implements a search engine in Hadoop over millions of Wikipedia documents, using an inverted index to make search operations more efficient. An inverted index maps each word in a document, or set of documents, to its corresponding locations. The structure uses a hash table that stores each word as a key and its corresponding locations as the values, which provides simple lookup and retrieval of data and makes it suitable for search operations.
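The mapping described in the abstract can be illustrated with a minimal, single-process sketch. This is not the authors' Hadoop implementation: the function names (`map_phase`, `reduce_phase`, `build_index`, `search`), the tokenizer, the sample documents, and the choice of a word offset as the "location" are all assumptions made for illustration. The two phases only mimic the shape of a MapReduce job, with a Python dict standing in for the hash table.

```python
import re
from collections import defaultdict

# Assumed tokenizer: lowercase alphanumeric runs count as words.
TOKEN = re.compile(r"[a-z0-9]+")


def map_phase(doc_id, text):
    """Emit (word, (doc_id, position)) pairs, as a mapper might."""
    for position, match in enumerate(TOKEN.finditer(text.lower())):
        yield match.group(), (doc_id, position)


def reduce_phase(pairs):
    """Group emitted pairs by word into a hash table: word -> sorted locations."""
    index = defaultdict(list)
    for word, location in pairs:
        index[word].append(location)
    return {word: sorted(locations) for word, locations in index.items()}


def build_index(documents):
    """Run the map phase over every document, then reduce into one index."""
    pairs = (
        pair
        for doc_id, text in documents.items()
        for pair in map_phase(doc_id, text)
    )
    return reduce_phase(pairs)


def search(index, word):
    """Look up a word; returns a list of (doc_id, position) locations."""
    return index.get(word.lower(), [])


# Hypothetical sample documents, standing in for Wikipedia articles.
documents = {
    "doc1": "Hadoop stores big data in HDFS",
    "doc2": "An inverted index maps each word to its locations in HDFS",
}
index = build_index(documents)
print(search(index, "HDFS"))  # [('doc1', 5), ('doc2', 10)]
```

Because the index is a hash table keyed by word, a lookup costs roughly constant time on average regardless of how many documents were indexed, which is the property the abstract relies on for search.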

Keywords / Index Terms

Hadoop; Big Data; Efficient Indexing; Data Structure
