Text Mining Using Frequent Pattern Analysis and Message Passing

M. Deeba, Mary Immaculate Sheela

Section: Research Paper, Product Type: Journal Paper
Volume-7, Issue-2, Page no. 658-667, Feb-2019

CrossRef-DOI: https://doi.org/10.26438/ijcse/v7i2.658667

Online published on Feb 28, 2019

Copyright © M. Deeba, Mary Immaculate Sheela. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: M. Deeba, Mary Immaculate Sheela, “Text Mining Using Frequent Pattern Analysis and Message Passing,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.2, pp.658-667, 2019.

MLA Style Citation: M. Deeba, Mary Immaculate Sheela. "Text Mining Using Frequent Pattern Analysis and Message Passing." International Journal of Computer Sciences and Engineering 7.2 (2019): 658-667.

APA Style Citation: M. Deeba, Mary Immaculate Sheela (2019). Text Mining Using Frequent Pattern Analysis and Message Passing. International Journal of Computer Sciences and Engineering, 7(2), 658-667.

BibTex Style Citation:
@article{Deeba_2019,
author = {M. Deeba and Mary Immaculate Sheela},
title = {Text Mining Using Frequent Pattern Analysis and Message Passing},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {2 2019},
volume = {7},
number = {2},
month = {2},
year = {2019},
issn = {2347-2693},
pages = {658--667},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=3722},
doi = {10.26438/ijcse/v7i2.658667},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v7i2.658667
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=3722
TI  - Text Mining Using Frequent Pattern Analysis and Message Passing
T2  - International Journal of Computer Sciences and Engineering
AU  - M. Deeba
AU  - Mary Immaculate Sheela
PY  - 2019
DA  - 2019/02/28
PB  - IJCSE, Indore, INDIA
SP  - 658
EP  - 667
IS  - 2
VL  - 7
SN  - 2347-2693
ER  -

Abstract

Text mining, also called text analysis, is a computer science technique for deriving high-quality information from text. It converts text into data suitable for analysis and makes it possible to investigate relationships among patterns that would otherwise be extremely difficult to find. Various techniques are used to mine frequent patterns in text, and they are applicable to analyzing the information in very large documents. Parallel construction and parallel mining of FP-trees on multi-core systems is a popular tree-projection-based mining approach. Each processor first counts the frequency of each item in its local data partition; all worker processors then send their local counts to the master processor, which combines them into a global count. The parallel implementation of FP-tree may show good speedups, but sending the local results to the master in a distributed environment and merging the pattern counts on the master core are overheads that consume considerable time. This study analyzes various frequent pattern mining techniques used to extract information from text, especially on multi-core systems, and adopts a new technique for finding frequent patterns that uses the dictionary-based compression algorithm LZW. The new technique is implemented on a single processor as well as on multiple processors using message passing. The main objective of this research is to increase the speed and reduce the memory consumption required to extract frequent patterns from the given textual data. The parallel implementation of the proposed LZW-based algorithm is compared with a parallel implementation of FP-Growth on single and multiple cores using three datasets: Webdoc, Kosarak, and Trump. The results show good performance in speedup, latency, and efficiency for the proposed LZW-based algorithm.
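
The counting step described in the abstract (local counts on each worker, merged into a global count on the master) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `local_count`, `global_count`, and `frequent_items`, the toy partitions, and the use of a thread pool in place of real message passing between processors are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def local_count(partition):
    """Worker step (hypothetical name): count, for one data partition,
    the number of transactions in which each item occurs."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))  # count an item once per transaction
    return counts


def global_count(partitions, workers=2):
    """Master step (hypothetical name): collect the local counts and merge them.

    A thread pool stands in for the worker processors here; in a
    message-passing setup each local count would be sent to the master."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        local_counts = list(pool.map(local_count, partitions))
    total = Counter()
    for counts in local_counts:
        total.update(counts)  # adds the counts item by item
    return total


def frequent_items(counts, min_support):
    """Keep only the items whose global count reaches the support threshold."""
    return {item: n for item, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    # Toy data: two partitions, each a list of transactions (lists of words).
    partitions = [
        [["text", "mining"], ["text", "pattern"]],
        [["text", "mining", "mining"], ["tree"]],
    ]
    print(frequent_items(global_count(partitions), min_support=2))
```

The merge on the master is the step the abstract identifies as overhead: every worker's full local count has to be transferred and combined before the globally frequent items are known.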

Keywords / Index Terms

Parallel FP-Growth, Frequent Keywords Mining, Multi core Systems
