Open Access Article

An Algorithm for Mining Frequent Closed Itemsets with Density from Data Streams

Caiyan Dai, Ling Chen

Section: Research Paper, Product Type: Journal Paper
Volume-4, Issue-2, Page no. 40-48, Feb-2016

Online published on Feb 29, 2016

Copyright © Caiyan Dai, Ling Chen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper

IEEE Style Citation: Caiyan Dai, Ling Chen, “An Algorithm for Mining Frequent Closed Itemsets with Density from Data Streams,” International Journal of Computer Sciences and Engineering, Vol.4, Issue.2, pp.40-48, 2016.

MLA Style Citation: Caiyan Dai, Ling Chen. "An Algorithm for Mining Frequent Closed Itemsets with Density from Data Streams." International Journal of Computer Sciences and Engineering 4.2 (2016): 40-48.

APA Style Citation: Caiyan Dai, Ling Chen (2016). An Algorithm for Mining Frequent Closed Itemsets with Density from Data Streams. International Journal of Computer Sciences and Engineering, 4(2), 40-48.

BibTex Style Citation:
@article{Dai_2016,
author = {Caiyan Dai and Ling Chen},
title = {An Algorithm for Mining Frequent Closed Itemsets with Density from Data Streams},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {2 2016},
volume = {4},
number = {2},
month = {2},
year = {2016},
issn = {2347-2693},
pages = {40-48},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=792},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=792
TI - An Algorithm for Mining Frequent Closed Itemsets with Density from Data Streams
T2 - International Journal of Computer Sciences and Engineering
AU - Caiyan Dai
AU - Ling Chen
PY - 2016
DA - 2016/02/29
PB - IJCSE, Indore, INDIA
SP - 40
EP - 48
IS - 2
VL - 4
SN - 2347-2693
ER -


Abstract

Mining frequent closed itemsets from data streams is an important topic. In this paper, we propose an algorithm for mining frequent closed itemsets from data streams based on a time fading model. The algorithm dynamically constructs a pattern tree and uses a fading factor to calculate the densities of the itemsets in the tree. To reduce memory cost, it deletes truly infrequent itemsets from the pattern tree. A density threshold function is designed to identify the truly infrequent itemsets that should be deleted; with this function, deleting them does not affect the result of frequent itemset detection. To reduce computation time, the algorithm modifies the pattern tree and detects the frequent closed itemsets at fixed time intervals. We also analyze the error caused by deleting the infrequent itemsets. The experimental results indicate that our algorithm achieves higher accuracy and requires less memory and computation time than other algorithms.
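
As a rough illustration of the time fading idea described in the abstract, the following Python sketch keeps a decayed density per itemset and prunes entries that fall below a threshold. It is not the paper's algorithm: the names `DecayedCount`, `value_at`, `observe`, and `prune`, the update rule (decay by `fading` per time step, then add 1 per occurrence), and the use of a fixed threshold in place of the paper's density threshold function are all assumptions made for illustration. The pattern tree and the closedness check are omitted.

```python
from dataclasses import dataclass


@dataclass
class DecayedCount:
    """Time-faded density of one itemset (hypothetical structure)."""
    density: float = 0.0
    last_update: int = 0

    def value_at(self, now: int, fading: float) -> float:
        # Decay the stored density forward to time `now` without changing it.
        return self.density * fading ** (now - self.last_update)

    def observe(self, now: int, fading: float) -> None:
        # Assumed update rule: decay to `now`, then add 1 for the occurrence.
        self.density = self.value_at(now, fading) + 1.0
        self.last_update = now


def prune(table, now, fading, threshold):
    """Keep only itemsets whose faded density reaches `threshold`.

    A constant threshold is an assumption; the paper designs a
    density threshold function for this step.
    """
    return {
        itemset: count
        for itemset, count in table.items()
        if count.value_at(now, fading) >= threshold
    }


if __name__ == "__main__":
    fading = 0.5
    table = {}
    stream = [(0, {"a", "b"}), (1, {"a", "b"}), (3, {"c"})]
    for t, items in stream:
        key = frozenset(items)
        table.setdefault(key, DecayedCount(last_update=t)).observe(t, fading)
    table = prune(table, now=3, fading=fading, threshold=0.5)
    print(sorted(sorted(k) for k in table))
```

Storing the last update time with each density lets the decay be applied lazily, only when an itemset is observed or checked, rather than touching every entry at every time step.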

Keywords / Index Terms

Data Streams; Frequent Closed Itemsets; Data Mining; Time Fading Model
