
Analytical Study of Association Rule Mining Algorithm for Retrieving Frequent Itemsets in Big Datasets

Sachin Kumar Pandey

Section: Research Paper, Product Type: Journal Paper
Volume 6, Issue 7, Page no. 424-436, Jul 2018

CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.424436

Online published on Jul 31, 2018

Copyright © Sachin Kumar Pandey. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper


IEEE Style Citation: Sachin Kumar Pandey, “Analytical Study of Association Rule Mining Algorithm for Retrieving Frequent Itemsets in Big Datasets,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.424-436, 2018.

MLA Style Citation: Sachin Kumar Pandey. "Analytical Study of Association Rule Mining Algorithm for Retrieving Frequent Itemsets in Big Datasets." International Journal of Computer Sciences and Engineering 6.7 (2018): 424-436.

APA Style Citation: Sachin Kumar Pandey (2018). Analytical Study of Association Rule Mining Algorithm for Retrieving Frequent Itemsets in Big Datasets. International Journal of Computer Sciences and Engineering, 6(7), 424-436.

BibTex Style Citation:
@article{Pandey_2018,
author = {Sachin Kumar Pandey},
title = {Analytical Study of Association Rule Mining Algorithm for Retrieving Frequent Itemsets in Big Datasets},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {7 2018},
volume = {6},
number = {7},
month = {7},
year = {2018},
issn = {2347-2693},
pages = {424-436},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2452},
doi = {https://doi.org/10.26438/ijcse/v6i7.424436},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
DO - 10.26438/ijcse/v6i7.424436
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=2452
TI - Analytical Study of Association Rule Mining Algorithm for Retrieving Frequent Itemsets in Big Datasets
T2 - International Journal of Computer Sciences and Engineering
AU - Sachin Kumar Pandey
PY - 2018
DA - 2018/07/31
PB - IJCSE, Indore, INDIA
SP - 424
EP - 436
IS - 7
VL - 6
SN - 2347-2693
ER -


Abstract

Managing and retrieving information from Big Data demands extensible techniques for extracting relevant information. Data mining analyzes large volumes of data to derive new information using pattern models, learning, and associations. With the expansion of the World Wide Web, the amount of data stored and made available by computers has increased enormously, and techniques for retrieving information from big data have gained great importance for the business, scientific, and engineering research communities. Frequent Itemset Mining is one of the most widely used methods for extracting useful information from data. However, when this technique is applied to Big Data, the combinatorial explosion of itemsets becomes a challenge. Recent developments in parallel programming provide good tools to overcome this difficulty. These tools, however, have technical drawbacks of their own, for example balanced data distribution and inter-communication costs. In this paper, we investigate the application of Frequent Itemset Mining using the MapReduce framework. We introduce a new technique for mining big datasets: Big-Frequent-Itemset Mining. This technique is optimized to run on extremely large datasets. Along similar lines, we apply distributed association rule mining algorithms to big datasets, namely a Genetic Algorithm and Adaptive-Miner, which use an adaptive approach for finding frequent patterns with higher accuracy and efficiency. Adaptive-Miner uses an adaptive strategy based on partial processing of the dataset. It constructs execution plans before every iteration and proceeds with the most suitable plan to reduce time and space complexity. Adaptive-Miner is a dynamic association rule mining algorithm that adjusts its approach based on the nature of the dataset, so it differs from, and improves on, current static association rule mining algorithms. We conduct systematic experiments to gain insight into the efficiency and scalability of the Adaptive-Miner algorithm on big datasets. Using these experiments, we demonstrate the scalability of the techniques.
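
To make the idea of counting itemsets in a map/reduce fashion concrete, the following is a minimal single-machine sketch in Python. It is not the Big-Frequent-Itemset Mining or Adaptive-Miner algorithm described in the paper; it is a plain level-wise (Apriori-style) search in which each level does one "map" pass per data partition and one "reduce" pass that sums the partial counts. The function names (`map_count`, `reduce_counts`, `frequent_itemsets`) and the toy transactions are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def map_count(partition, candidates):
    """Map step: count, within one partition, the transactions containing each candidate."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def reduce_counts(partial_counts):
    """Reduce step: sum per-partition counts into global support counts."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def frequent_itemsets(partitions, min_support):
    """Level-wise search; each level corresponds to one map/reduce round."""
    items = {item for part in partitions for t in part for item in t}
    candidates = {frozenset([item]) for item in items}
    result = {}
    k = 1
    while candidates:
        totals = reduce_counts(map_count(p, candidates) for p in partitions)
        frequent = {c: n for c, n in totals.items() if n >= min_support}
        result.update(frequent)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset.
        k += 1
        joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
    return result


if __name__ == "__main__":
    # Hypothetical toy data: two partitions of market-basket transactions.
    partitions = [
        [{"bread", "milk"}, {"bread", "butter"}],
        [{"bread", "milk", "butter"}, {"milk"}],
    ]
    for itemset, support in frequent_itemsets(partitions, min_support=2).items():
        print(sorted(itemset), support)
```

In a real MapReduce deployment the map calls would run on separate workers and the framework would perform the summation; the sketch keeps both steps in one process so that the data flow is easy to follow.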

Keywords / Index Terms

Genetic Algorithm; association rule mining algorithm; association rules; big data sets; frequent pattern mining; MapReduce.
