Open Access   Article Go Back

Performance analysis of PFI using FP-Growth algorithm for various data-sets

Shital A. Patil1 , Amol Potgantwar2

Section:Research Paper, Product Type: Journal Paper
Volume-4 , Issue-1 , Page no. 43-50, Jan-2016

Online published on Jan 31, 2016

Copyright © Shital A. Patil, Amol Potgantwar . This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

View this paper at   Google Scholar | DPI Digital Library

How to Cite this Paper

  • IEEE Citation
  • MLA Citation
  • APA Citation
  • BibTex Citation
  • RIS Citation

IEEE Style Citation: Shital A. Patil, Amol Potgantwar, “Performance analysis of PFI using FP-Growth algorithm for various data-sets,” International Journal of Computer Sciences and Engineering, Vol.4, Issue.1, pp.43-50, 2016.

MLA Style Citation: Shital A. Patil, Amol Potgantwar "Performance analysis of PFI using FP-Growth algorithm for various data-sets." International Journal of Computer Sciences and Engineering 4.1 (2016): 43-50.

APA Style Citation: Shital A. Patil, Amol Potgantwar, (2016). Performance analysis of PFI using FP-Growth algorithm for various data-sets. International Journal of Computer Sciences and Engineering, 4(1), 43-50.

BibTex Style Citation:
@article{Patil_2016,
author = {Shital A. Patil, Amol Potgantwar},
title = {Performance analysis of PFI using FP-Growth algorithm for various data-sets},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {1 2016},
volume = {4},
Issue = {1},
month = {1},
year = {2016},
issn = {2347-2693},
pages = {43-50},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=778},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=778
TI - Performance analysis of PFI using FP-Growth algorithm for various data-sets
T2 - International Journal of Computer Sciences and Engineering
AU - Shital A. Patil, Amol Potgantwar
PY - 2016
DA - 2016/01/31
PB - IJCSE, Indore, INDIA
SP - 43-50
IS - 1
VL - 4
SN - 2347-2693
ER -

VIEWS PDF XML
1833 1680 downloads 1568 downloads
  
  
           

Abstract

The data handled in appearing applications like placement or situation based services, sensor monitoring systems, and data integration, are often not exact in nature. In this article, we study the important problem of extracting frequent item sets[1] from a huge unsure database, illuminated under the Possible World Semantics (PWS)[2].This issue is technically challenging, since an unsure database consist an exponential number of possible worlds. By observing that the mining process can be show as a discrete probability distribution, we develop an FP Growth algorithm [4] which compress a large database into a dense, Frequent-Pattern tree (FP-tree) [4] structure also Develop an efficient, FP-tree-based frequent pattern mining method (FP-growth) and Apriori algorithm for frequent item set mining. We also study the important problem of maintaining the mining result for a database that is developing (e.g. by inserting a tuple). Specifically, we present incremental mining algorithms [13], which enable Probabilistic repeated Item set (PFI) results to be refreshed. This decrease the requirement of re-executing the whole mining algorithm on the new database, which is often more expensive and unnecessary. We observe how an existing algorithm that retrieves exact item sets, as well as our approximate algorithm, can support incremental mining. All our algorithms support both tuple and attribute uncertainty, which are two common uncertain database models. We also perform huge evaluation on real and synthetic data sets to validate our approaches.

Key-Words / Index Term

Frequent Item Sets, Uncertain Data Set, FP Growth Algorithm,Association Rule Mining,Apriori Algorithm

References

[1] Liang Wang, David Wai-Lok Cheung, Reynold Cheng, S Lee, X Yung,“Efficient Mining of Frequent Item Sets onLarge Uncertain Databases”,IEEE Trans Knowledge and Data Eng.,2012 pp.110-115.
[2] A. Veloso, W. Meira Jr., M. de Carvalho,S.Parthasarathy, and M.J. Zaki, “Mining Frequent Itemsets inEvolving Databases”,Proc. Second SIAM Int’l Conf. Data Mining(SDM), 2002 pp.210-215.
[3] C. Aggarwal, Y. Li, J. Wang, and J. Wang, “Frequent PatternMining with Uncertain Data”,Proc. 15th ACM SIGKDD Int’l Conf.Knowledge Discovery and Data Mining (KDD), 2009 pp.160-170.
[4] C. Aggarwal and P. Yu, “A Survey of Uncertain Data Algorithms and Applications”,IEEE Trans Knowledge and Data Eng., vol. 21,no. 5, May 2009,pp. 609-623.
[5] R. Aggrawal, T. Imieli_nski, and A. Swami, “Mining AssociationRules between Sets of Items in Large Databases”,Proc. ACMSIGMOD Int’l Conf. Management of Data, 1993 pp.348-255.
[6] O. Benjelloun, A.D. Sarma, A. Halevy, and J. Widom, “ULDBs:Databases with Uncertainty and Lineage”,Proc. 32nd Int’l Conf.Very Large Data Bases (VLDB), 2006 pp.440-445.
[7] T. Bernecker, H. Kriegel, M. Renz, F. Verhein, and A. Zuefle,“Probabilistic Frequent Itemset Mining in Uncertain Databases”,Proc. 15th ACM SIGKDD Int’l Conf. Knowledge Discovery and DataMining (KDD), 2009 pp.523-530
[8] L.L. Cam, “An Approximation Theorem for the Poisson Binomial Distribution”,Pacific J. Math., vol. 10, ,1990, pp. 1181-1197.
[9] H. Cheng, P. Yu, and J. Han, “Approximate Frequent ItemsetMining in the Presence of Random Noise”,Proc. Soft Computing forKnowledge Discovery and Data Mining, 2008,pp. 363-389.
[10] R. Cheng, D. Kalashnikov, and S. Prabhakar, “EvaluatingProbabilistic Queries over Imprecise Data”,Proc. ACM SIGMODInt’l Conf. Management of Data, 2003 pp.789-792.
[11] D. Cheung, J. Han, V. Ng, and C. Wong, “Maintenance ofDiscovered Association Rules in Large Databases: An IncrementalUpdating Technique”,Proc. 12th Int’l Conf. Data Eng. (ICDE), 1996 pp.890-892.
[12] D. Cheung, S.D. Lee, and B. Kao, “A General IncrementalTechnique for Maintaining Discovered Association Rules”,Proc.Fifth Int’l Conf. Database Systems for Advanced Applications(DASFAA), 1997 pp.267-270.
[13] W. Cheung and O.R. Zaiane, “Incremental Mining of FrequentPatterns with Candidate Generation or Support Constraint”,Proc. Seventh Int’l Database Eng. and Applications Symp. (IDEAS),2003 pp.300-345.
[14] C.K. Chui, B. Kao, and E. Hung, “Mining Frequent Itemsets fromUncertain Data”,Proc. 11th Pacific-Asia Conf. Advances in KnowledgeDiscovery and Data Mining (PAKDD), 2007 pp.867-870.
[15] G. Cormode and M. Garofalakis, “Sketching Probabilistic DataStreams”,Proc. ACM SIGMOD Int’l Conf. Management of Data,2007.
[16] M Adnan, R Alhajj, K Barker - Advances in Artificial Intelligence “Costructing complete FP tree for incremental mining of frequent patterns in dynamic database”, 2006 – Springer
[17] J. Han, M. Kamber, “Data Mining Concepts and Techniques”, 3rd edition, Morgan Kaufmann Publishers, San Francisco, USA, ISBN 9780123814791, 2012,pp. 243-262.
[18] Thomas Bernecker, Hans-Peter Kriegel, Matthias Renz,FlorianVerhein,AndreasZuefle,”Probabilisttc Frequent Itemset Mining in Uncertain Databases”
In Proc.11th Int. Conf. on Knowledge Discovery and Data Mining (KDD'09),Paris, France,pp.300-365,2009.
[19] Leung, C.K.-S., Carmichael, C.L., Hao, B.: Efficient mining of frequent patterns from uncer-tain data. In: Proc. IEEE ICDM Workshops, pp. 489–494 ,2007
[20] C.-K. Chui, B. Kao, E. Hung. Mining Frequent
Itemsets from Uncertain Data.PAKDD.pp.500-512, 2007
[21] H. Cheng, P. Yu, and J. Han.”Approximate frequent itemset mining in the presence of random noise”.Soft Computing for Knowledge Discovery and Data Mining, pp.612-615,2008
[22] Carson Kai-Sang Leung,Quamrul I. Khan ,Tariqul Hoque.” CanTree: A Tree Structure for Efficient Incremental Mining of Frequent Patterns”pp.700-712,2005
[23] Charu C. Aggarwal,Senior Member, and Philip S. Yu,Fellow “A Survey of Uncertain Data Algorithms and Applications“conf on ieee transactions on knowledge and data engineering, vol. 21,may 2005 pp.523-540.
[24] Jianxiong Luo, Susan M.Bridges”Mining fuzzy Association rules and fuzzy frequency episodes for intrusion detection”pp.812-816.August,2000.
[25] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. The design of an acquisitional query processor for sensor networks. In ACM SIGMOD, pp.435-440,2003