Open Access Article

Harvesting the Resources of Invisible Web

Hardeep Singh, Geet Bawa

Section: Review Paper, Product Type: Journal Paper
Volume-3, Issue-11, Page no. 28-32, Nov-2015

Online published on Nov 30, 2015

Copyright © Hardeep Singh, Geet Bawa. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Hardeep Singh, Geet Bawa, “Harvesting the Resources of Invisible Web,” International Journal of Computer Sciences and Engineering, Vol.3, Issue.11, pp.28-32, 2015.

MLA Style Citation: Hardeep Singh, Geet Bawa. "Harvesting the Resources of Invisible Web." International Journal of Computer Sciences and Engineering 3.11 (2015): 28-32.

APA Style Citation: Hardeep Singh, Geet Bawa (2015). Harvesting the Resources of Invisible Web. International Journal of Computer Sciences and Engineering, 3(11), 28-32.

BibTex Style Citation:
@article{Singh_2015,
author = {Hardeep Singh and Geet Bawa},
title = {Harvesting the Resources of Invisible Web},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {11 2015},
volume = {3},
number = {11},
month = {11},
year = {2015},
issn = {2347-2693},
pages = {28-32},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=721},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
AU  - Singh, Hardeep
AU  - Bawa, Geet
TI  - Harvesting the Resources of Invisible Web
T2  - International Journal of Computer Sciences and Engineering
PY  - 2015
DA  - 2015/11/30
PB  - IJCSE, Indore, INDIA
VL  - 3
IS  - 11
SP  - 28
EP  - 32
SN  - 2347-2693
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=721
ER  - 

Abstract

The World Wide Web has become an important part of social, cultural, political, educational, academic, and commercial life. It contains a wide range of information and applications in areas of societal interest. A great number of Web users rely on search engines for information retrieval, yet they often hesitate before making a final decision because only rough and limited information about products is made available. Millions of high-quality resources on the Web cannot be seen by general-purpose search engines. One contributing reason may be the use of irrelevant keywords or the choice of the wrong search engine for a particular request, and in many cases a search engine simply cannot determine exactly what the searcher wanted. Beyond these reasons, the major cause of poor results is the technical inability of search engines to access and retrieve some of the content present on the Web. That is, some information is hidden even from efficient search engines. Information that remains inaccessible to Web search engines is termed the “Invisible Web”. The Invisible Web contains resources that are not indexed by general-purpose search engines, but this does not mean that these resources are leftovers or unimportant: information that a search engine cannot access is as significant as information that it can. The Invisible Web is a phenomenon to be reckoned with. This paper provides an overview of the Invisible Web and examines the reasons why search engines cannot see all Web content. The various resources present in the Invisible Web are also discussed, and a list of search engines that can mine and harvest the Invisible Web is provided.

Keywords / Index Terms

Search Engines; Invisible Web; Surface Web; Internet Portals.
