Identifying Competitors from Large Unstructured Dataset Using Naïve Bayes Classifier and Apriori Algorithm

A.A. Kushwah, Y.C. Kulkarni

Section: Research Paper, Product Type: Journal Paper
Volume-6, Issue-6, Page no. 442-450, Jun-2018

CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i6.442450

Online published on Jun 30, 2018

Copyright © A.A. Kushwah, Y.C. Kulkarni. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: A.A. Kushwah, Y.C. Kulkarni, “Identifying Competitors from Large Unstructured Dataset Using Naïve Bayes Classifier and Apriori Algorithm,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.6, pp.442-450, 2018.

MLA Style Citation: A.A. Kushwah, Y.C. Kulkarni. "Identifying Competitors from Large Unstructured Dataset Using Naïve Bayes Classifier and Apriori Algorithm." International Journal of Computer Sciences and Engineering 6.6 (2018): 442-450.

APA Style Citation: A.A. Kushwah, Y.C. Kulkarni (2018). Identifying Competitors from Large Unstructured Dataset Using Naïve Bayes Classifier and Apriori Algorithm. International Journal of Computer Sciences and Engineering, 6(6), 442-450.

BibTex Style Citation:
@article{Kushwah_2018,
author = {A.A. Kushwah and Y.C. Kulkarni},
title = {Identifying Competitors from Large Unstructured Dataset Using Naïve Bayes Classifier and Apriori Algorithm},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {6 2018},
volume = {6},
Issue = {6},
month = {6},
year = {2018},
issn = {2347-2693},
pages = {442-450},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2203},
doi = {10.26438/ijcse/v6i6.442450},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
DO - 10.26438/ijcse/v6i6.442450
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=2203
TI - Identifying Competitors from Large Unstructured Dataset Using Naïve Bayes Classifier and Apriori Algorithm
T2 - International Journal of Computer Sciences and Engineering
AU - A.A. Kushwah
AU - Y.C. Kulkarni
PY - 2018
DA - 2018/06/30
PB - IJCSE, Indore, INDIA
SP - 442
EP - 450
IS - 6
VL - 6
SN - 2347-2693
ER -

Abstract

A long line of research has shown the vital importance of identifying and monitoring a company's competitors. This task raises several questions: How do we define and measure the competitiveness between two items? Who are the most important competitors of a given item? Which features of an item affect its competitiveness? Motivated by this problem, the marketing and management community has focused on empirical methods for competitor identification as well as on methods for analyzing known competitors. Existing research on the former has concentrated on mining comparative expressions (e.g., one product is superior to another) from the web or other textual sources. Although such expressions can indeed indicate competitiveness, they are absent in many domains. From a survey of the literature, we found that the market sector is of basic importance in defining the competitiveness between two items. In this paper, we present a novel definition of the competitiveness between two items, based on the market sector. The system estimates competitiveness from customer reviews across different domains, a plentiful source of information, and provides an efficient approach for evaluating competitiveness in large review datasets and finding the top-k competitors. Our experiments on a corpus of customer reviews from Yelp.in, TripAdvisor.com, and Amazon indicate that the proposed methodology extracts comparative relations more precisely. We also present an efficient framework for classifying reviews from mainstream domains using k-means clustering and the Naïve Bayes algorithm. The system evaluates the competitiveness of two items from frequent itemsets, mined with the Apriori algorithm, to find the top-k competitors.
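
To make the last step of the abstract concrete, the following is a minimal sketch of how frequent itemsets mined with Apriori could be used to rank top-k competitors. It is an illustration under stated assumptions, not the authors' implementation: the transaction data, the hotel names, and the helper names `apriori` and `top_k_competitors` are all hypothetical, and competitiveness is approximated here simply by the support of 2-itemsets (how often two items appear together in the same transaction).

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count support of the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


def top_k_competitors(item, transactions, min_support, k):
    """Rank items sharing a frequent 2-itemset with `item`, by support."""
    scored = []
    for itemset, support in apriori(transactions, min_support).items():
        if len(itemset) == 2 and item in itemset:
            (other,) = itemset - {item}
            scored.append((other, support))
    # Highest support first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Hypothetical data: each set lists the hotels one reviewer wrote about.
reviews = [
    {"HotelA", "HotelB", "HotelC"},
    {"HotelA", "HotelB"},
    {"HotelA", "HotelC"},
    {"HotelB", "HotelC"},
    {"HotelA", "HotelB", "HotelD"},
]

print(top_k_competitors("HotelA", reviews, min_support=2, k=2))
# [('HotelB', 3), ('HotelC', 2)]
```

In this toy data, HotelD is dropped at the first level because it appears only once, and the triple {HotelA, HotelB, HotelC} is generated as a candidate but rejected because it occurs in a single transaction. A real pipeline along the lines of the abstract would first classify and cluster reviews before building the transactions; that part is not shown here.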

Keywords / Index Terms

Data mining, Web mining, Information Search and Retrieval, K-means clustering, Naïve Bayes Classifier, Rule mining.
