
Data Mining Techniques used in Software Engineering: A Survey

Nidhin Thomas, Atharva Joshi, Rishikesh Misal, Manjula R

Section: Survey Paper, Product Type: Journal Paper
Volume 4, Issue 3, Page no. 28-34, Mar 2016

Online published on Mar 30, 2016

Copyright © Nidhin Thomas, Atharva Joshi, Rishikesh Misal, Manjula R. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper

IEEE Style Citation: Nidhin Thomas, Atharva Joshi, Rishikesh Misal, Manjula R, “Data Mining Techniques used in Software Engineering: A Survey,” International Journal of Computer Sciences and Engineering, Vol.4, Issue.3, pp.28-34, 2016.

MLA Style Citation: Nidhin Thomas, Atharva Joshi, Rishikesh Misal, Manjula R. "Data Mining Techniques used in Software Engineering: A Survey." International Journal of Computer Sciences and Engineering 4.3 (2016): 28-34.

APA Style Citation: Nidhin Thomas, Atharva Joshi, Rishikesh Misal, Manjula R (2016). Data Mining Techniques used in Software Engineering: A Survey. International Journal of Computer Sciences and Engineering, 4(3), 28-34.

BibTex Style Citation:
@article{Thomas_2016,
author = {Nidhin Thomas and Atharva Joshi and Rishikesh Misal and Manjula R},
title = {Data Mining Techniques used in Software Engineering: A Survey},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {3 2016},
volume = {4},
number = {3},
month = {3},
year = {2016},
issn = {2347-2693},
pages = {28--34},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=822},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=822
TI  - Data Mining Techniques used in Software Engineering: A Survey
T2  - International Journal of Computer Sciences and Engineering
AU  - Nidhin Thomas
AU  - Atharva Joshi
AU  - Rishikesh Misal
AU  - Manjula R
PY  - 2016
DA  - 2016/03/30
PB  - IJCSE, Indore, INDIA
SP  - 28
EP  - 34
IS  - 3
VL  - 4
SN  - 2347-2693
ER  -


Abstract

A typical software development process has several stages, each with its own significance and each dependent on the others. Each stage is often complex and generates a wide variety of data. Using data mining techniques, we can uncover hidden patterns in this data, measure the impact of each stage on the others, and gather useful information to improve the software development process. The insights gained from the extracted knowledge patterns can help software engineers predict, plan, and comprehend the various intricacies of a project, allowing them to optimize future software development activities. Because every stage in the development process targets a specific outcome or goal, it becomes crucial to select the data mining techniques best suited to achieving these goals efficiently. In this paper, we survey the available data mining techniques and propose the most appropriate techniques for each stage of the development process. We also discuss how data mining improves the software development process in terms of time, cost, resources, reliability, and maintainability.

Keywords / Index Terms

Data Mining, Software Engineering, KDD methods, Software Development, Frequent Pattern Mining, Text Mining, Classification, Clustering
