Harvesting the Resources of Invisible Web

Hardeep Singh and Geet Bawa

Section: Review Paper, Product Type: Journal Paper
Volume-3, Issue-11, Page no. 28-32, Nov-2015

Online published on Nov 30, 2015

Copyright © Hardeep Singh, Geet Bawa. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Hardeep Singh, Geet Bawa, “Harvesting the Resources of Invisible Web,” International Journal of Computer Sciences and Engineering, Vol.3, Issue.11, pp.28-32, 2015.

MLA Style Citation: Hardeep Singh, Geet Bawa. "Harvesting the Resources of Invisible Web." International Journal of Computer Sciences and Engineering 3.11 (2015): 28-32.

APA Style Citation: Hardeep Singh, Geet Bawa (2015). Harvesting the Resources of Invisible Web. International Journal of Computer Sciences and Engineering, 3(11), 28-32.

BibTex Style Citation:
@article{Singh_2015,
author = {Hardeep Singh and Geet Bawa},
title = {Harvesting the Resources of Invisible Web},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {11 2015},
volume = {3},
number = {11},
month = {11},
year = {2015},
issn = {2347-2693},
pages = {28--32},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=721},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=721
TI  - Harvesting the Resources of Invisible Web
T2  - International Journal of Computer Sciences and Engineering
AU  - Singh, Hardeep
AU  - Bawa, Geet
PY  - 2015
DA  - 2015/11/30
PB  - IJCSE, Indore, INDIA
SP  - 28
EP  - 32
IS  - 11
VL  - 3
SN  - 2347-2693
ER  -

Abstract

The World Wide Web is steadily becoming an important part of social, cultural, political, educational, academic, and commercial life. The Web contains a wide range of information and applications in areas of societal interest. A great number of Web users rely on search engines for information retrieval, yet still hesitate before making a final decision, often because only rough and limited information about products is made available. Millions of high-quality resources on the Web cannot be seen by general-purpose search engines. One contributing reason may be the use of irrelevant keywords or the choice of the wrong search engine for a particular request. Often a search engine cannot work out exactly what the searcher wants. Beyond these reasons, the major cause of poor results is the technical inability of search engines to access and retrieve some of the content present on the Web. That is, some information is hidden even from efficient search engines. Information that remains inaccessible to web search engines is termed the “Invisible Web”. The Invisible Web contains resources that are not indexed by general-purpose search engines, but this does not mean that these resources are mere leftovers or unimportant. Information that a search engine cannot access is as significant as information that it can. The Invisible Web is a phenomenon to be reckoned with. This paper provides an overview of the Invisible Web and examines why search engines cannot see all Web content. The various resources present in the Invisible Web are also discussed. The paper also provides a list of search engines that can mine and harvest the Invisible Web.

Keywords / Index Terms

Search Engines; Invisible Web; Surface Web; Internet Portals.
