Open Access   Article Go Back

Crawling Hidden Objects with KNN Queries

P. Krithika1 , G. Sowmiya2

Section:Survey Paper, Product Type: Journal Paper
Volume-07 , Issue-04 , Page no. 307-309, Feb-2019

Online published on Feb 28, 2019

Copyright © P. Krithika, G. Sowmiya . This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

View this paper at   Google Scholar | DPI Digital Library

How to Cite this Paper

  • IEEE Citation
  • MLA Citation
  • APA Citation
  • BibTex Citation
  • RIS Citation

IEEE Style Citation: P. Krithika, G. Sowmiya, “Crawling Hidden Objects with KNN Queries,” International Journal of Computer Sciences and Engineering, Vol.07, Issue.04, pp.307-309, 2019.

MLA Style Citation: P. Krithika, G. Sowmiya "Crawling Hidden Objects with KNN Queries." International Journal of Computer Sciences and Engineering 07.04 (2019): 307-309.

APA Style Citation: P. Krithika, G. Sowmiya, (2019). Crawling Hidden Objects with KNN Queries. International Journal of Computer Sciences and Engineering, 07(04), 307-309.

BibTex Style Citation:
@article{Krithika_2019,
author = {P. Krithika, G. Sowmiya},
title = {Crawling Hidden Objects with KNN Queries},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {2 2019},
volume = {07},
Issue = {04},
month = {2},
year = {2019},
issn = {2347-2693},
pages = {307-309},
url = {https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=777},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=777
TI - Crawling Hidden Objects with KNN Queries
T2 - International Journal of Computer Sciences and Engineering
AU - P. Krithika, G. Sowmiya
PY - 2019
DA - 2019/02/28
PB - IJCSE, Indore, INDIA
SP - 307-309
IS - 04
VL - 07
SN - 2347-2693
ER -

           

Abstract

Many websites offering Location Based Services (LBS) provide a k NN search interface that returns the top-k nearest-neighbor objects (e.g., nearest restaurants) for a given query location. This paper addresses the problem of crawling all objects efficiently from an LBS website, through the public k NN web search interface it provides. Specifically, we develop crawling algorithm for 2D and higher-dimensional spaces, respectively, and demonstrate through theoretical analysis that the overhead of our algorithms can be bounded by a function of the number of dimensions and the number of crawled objects, regardless of the underlying distributions of the objects. We also extend the algorithms to leverage scenarios where certain auxiliary information about the underlying data distribution, e.g., the population density of an area which is often positively correlated with the density of LBS objects, is available. Extensive experiments on real-world datasets demonstrate the superiority of our algorithms over the state-of-the-art competitors in the literature.

Key-Words / Index Term

Knn, Hidden Objects, Location Based Services, Crawling, Density

References

[1] Mcdonalds, “Mcdonalds page, http://www.mcdonalds.com/,” [Accessed: Aug. 6, 2014]. [Online]. Available: url{http://www.mcdonalds.com/us/en/restaurant locator.html}
[2] S. Byers, J. Freire, and C. T. Silva, “Efficient acquisition of web data through restricted query interfaces,” in Poster Proceedingsof the Tenth International World Wide Web Conference, WWW 10,Hong Kong, China, May 1-5, 2001, 2001. [Online].Available:http://www10.org/cdrom/posters/1051.pdf
[3] W. D. Bae, S. Alkobaisi, S. H. Kim, S. Narayanappa, and C. Shahabi, “Web data retrieval: solving spatial range queries using k-nearest neighbor searches,” Geoinformatica, vol. 13, no. 4, pp. 483–514, 2009.
[4] G. E. Glasses, “Great eye glasses page, http://www.greateyeglasses.com/shop/search.php,” [Ac cessed: Jan. 20, 2014]. [Online]. Available: url{http://www.greateyeglasses.com/shop/search.php}
[5] Yahoo, “Yahoo local page, https://local.yahoo.com/,”[Accessed: Dec. 2012]. [Online]. Available: url{https://local.yahoo.com/}
[6] U. Census, “Us census, http://www.census.gov/cgibin/geo/shapefiles2013/layers.cgi,” [Accessed: Dec. 2013].[Online]. Available: url{http://www.census.gov/cgi-bin/geo/shapefiles2013/layers.cgi}
[7] L. Devroye, “Sample-based non-uniform random variate generation,” in Proceedings of the 18th conference on Winter simulation. ACM, 1986, pp. 260–265.
[8] L. Barbosa and J. Freire, “Siphoning hidden-web data throughkeyword-based interfaces,” in SBBD, 2004, pp. 309–321.
[9] A. Ntoulas, P. Pzerfos, and J. Cho, “Downloading textualhidden web content through keyword queries,” in DigitalLibraries, 2005.JCDL’05.Proceedings of the 5th ACM/IEEE-CSJoint Conference on. IEEE, 2005, pp. 100–109.
[10] K. Vieira, L. Barbosa, J. Freire, and A. Silva, “Siphon++: ahidden-webcrawler for keyword-based interfaces,” in Proceedings of the 17th ACM conference on Information and knowledgemanagement. ACM, 2008, pp. 1361–1362.
[11] L. Jiang, Z. Wu, Q. Feng, J. Liu, and Q. Zheng, “Efficient deep web crawling using reinforcement learning,” in Advances inKnowledge Discovery and Data Mining. Springer, 2010, pp. 428–439.
[12] S. Raghavan and H. Garcia-Molina, “Crawling the hidden web,” in VLDB 2001, Proceedings of 27th InternationalConference on Very Large Data Bases, September 11-14, 2001,Roma, Italy, 2001, pp. 129–138. [Online]. Available: http://www.vldb.org/conf/2001/P129.pdf
[13] S. W. Liddle, D. W. Embley, D. T. Scott, and S. H. Yau, “Extracting data behind web forms,” in ConceptualModeling - ER 2002, 21st International Conference on ConceptualModeling, Tampere, Finland, October 7-11, 2002, Proceedings,
2002, pp. 402–413. [Online]. Available: http://dx.doi.org/10.1007/978-3-540-45275-1 35
[14] P. Wu, J. Wen, H. Liu, and W. Ma, “Query selection techniques for efficient crawling of structured web sources,” in Proceedingsof the 22nd International Conference on Data Engineering, ICDE2006, 3-8 April 2006, Atlanta, GA, USA, 2006, p. 47. [Online]. Available: http://dx.doi.org/10.1109/ICDE.2006.124
[15] M. Alvarez, J. Raposo, A. Pan, F. Cacheda, F. Bellas, and ´ V. Carneiro, “Crawling the content hidden behind web forms,” in Computational Science and Its Applications–ICCSA 2007. Springer, 2007, pp. 322–333.