Open Access Article

Empirical Analysis on Stream Classification & Clustering with Concept Drift in MOA

Hari A. Patel, Harsh N. Patel, Nirav Bhatt

Section: Research Paper, Product Type: Journal Paper
Volume-6, Issue-10, Page no. 341-345, Oct-2018

CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i10.341345

Published online on Oct 31, 2018

Copyright © Hari A. Patel, Harsh N. Patel, Nirav Bhatt. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Hari A. Patel, Harsh N. Patel, Nirav Bhatt, “Empirical Analysis on Stream Classification & Clustering with Concept Drift in MOA,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.10, pp.341-345, 2018.

MLA Style Citation: Hari A. Patel, Harsh N. Patel, Nirav Bhatt. "Empirical Analysis on Stream Classification & Clustering with Concept Drift in MOA." International Journal of Computer Sciences and Engineering 6.10 (2018): 341-345.

APA Style Citation: Hari A. Patel, Harsh N. Patel, Nirav Bhatt (2018). Empirical Analysis on Stream Classification & Clustering with Concept Drift in MOA. International Journal of Computer Sciences and Engineering, 6(10), 341-345.

BibTex Style Citation:
@article{Patel_2018,
author = {Hari A. Patel and Harsh N. Patel and Nirav Bhatt},
title = {Empirical Analysis on Stream Classification \& Clustering with Concept Drift in MOA},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {10 2018},
volume = {6},
Issue = {10},
month = {10},
year = {2018},
issn = {2347-2693},
pages = {341--345},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=3028},
doi = {10.26438/ijcse/v6i10.341345},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY - JOUR
DO - 10.26438/ijcse/v6i10.341345
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=3028
TI - Empirical Analysis on Stream Classification & Clustering with Concept Drift in MOA
T2 - International Journal of Computer Sciences and Engineering
AU - Patel, Hari A.
AU - Patel, Harsh N.
AU - Bhatt, Nirav
PY - 2018
DA - 2018/10/31
PB - IJCSE, Indore, INDIA
SP - 341
EP - 345
IS - 10
VL - 6
SN - 2347-2693
ER -

Abstract

Stream data processing is widely seen as the next major development in big data, one of the fastest-growing fields in computer science. Stream data analytics is a central part of data stream mining, and any classification of stream data must take concept drift and its effects into account. Massive Online Analysis (MOA) is one of the most popular tools for performing analytics on stream data. We focus on three features provided by MOA: classification, clustering, and concept drift. The key emphasis is on an experimental analysis of combinations of different procedures and learner algorithms suited to training a model that can then be used for prediction. We have also tried to identify drift (change) in the data and its effect on performance. Conceptually, once noise and drift are properly handled, we can construct a model that remains robust to the changes it faces. Stream analytics also requires exploring different clustering techniques, which have a wide range of applications. We present the empirical analysis carried out on classification and clustering techniques in MOA.

Keywords / Index Terms

Stream Processing, Concept Drift, Classification, Clustering, MOA
