Open Access Article

Deep Web Data Scraper: Search Engine

S. NainB, H. Lall

Section: Research Paper, Product Type: Journal Paper
Volume-2, Issue-5, Page no. 52-56, May-2014

Online published on May 31, 2014

Copyright © S. NainB, H. Lall. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: S. NainB, H. Lall, “Deep Web Data Scraper: Search Engine,” International Journal of Computer Sciences and Engineering, Vol.2, Issue.5, pp.52-56, 2014.

MLA Style Citation: S. NainB, H. Lall. "Deep Web Data Scraper: Search Engine." International Journal of Computer Sciences and Engineering 2.5 (2014): 52-56.

APA Style Citation: S. NainB, H. Lall (2014). Deep Web Data Scraper: Search Engine. International Journal of Computer Sciences and Engineering, 2(5), 52-56.

BibTex Style Citation:
@article{NainB_2014,
author = {S. NainB and H. Lall},
title = {Deep Web Data Scraper: Search Engine},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {5 2014},
volume = {2},
number = {5},
month = {5},
year = {2014},
issn = {2347-2693},
pages = {52-56},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=158},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=158
TI - Deep Web Data Scraper: Search Engine
T2 - International Journal of Computer Sciences and Engineering
AU - S. NainB
AU - H. Lall
PY - 2014
DA - 2014/05/31
PB - IJCSE, Indore, INDIA
SP - 52
EP - 56
IS - 5
VL - 2
SN - 2347-2693
ER -


Abstract

The World Wide Web grows every day, and people generally depend on search engines to explore it. Searching the web today can be compared to dragging a net across the surface of the ocean. Traditional search engines extract data from only a small portion of the web, whereas the larger portion is hidden behind search forms, in searchable structured and unstructured databases. The deep web offers high-quality content and broad coverage. A great deal of research has been carried out to bring this hidden data to the surface of the web. In this paper, we discuss the problems users face when scraping information from the deep web and present solutions to these problems using our new approach.

Keywords / Index Terms

Surface Web; Deep Web; Search Engine; Deep Web Search Engine; Crawler; Indexer; Human Powered Directory
