Open Access Article

A CSA based Source Code Plagiarism Detection Approach using Sparse Principle Component Analysis

M. Bhavani, K. Thammi Reddy, P. Suresh Varma

Section: Survey Paper, Product Type: Journal Paper
Volume 6, Issue 12, pp. 409-417, Dec 2018

CrossRef DOI: https://doi.org/10.26438/ijcse/v6i12.409417

Online published on Dec 31, 2018

Copyright © M. Bhavani, K. Thammi Reddy, P. Suresh Varma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: M. Bhavani, K. Thammi Reddy, P. Suresh Varma, “A CSA based Source Code Plagiarism Detection Approach using Sparse Principle Component Analysis,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.12, pp.409-417, 2018.

MLA Style Citation: M. Bhavani, K. Thammi Reddy, P. Suresh Varma. "A CSA based Source Code Plagiarism Detection Approach using Sparse Principle Component Analysis." International Journal of Computer Sciences and Engineering 6.12 (2018): 409-417.

APA Style Citation: M. Bhavani, K. Thammi Reddy, P. Suresh Varma (2018). A CSA based Source Code Plagiarism Detection Approach using Sparse Principle Component Analysis. International Journal of Computer Sciences and Engineering, 6(12), 409-417.

BibTex Style Citation:
@article{Bhavani_2018,
author = {M. Bhavani and K. Thammi Reddy and P. Suresh Varma},
title = {A CSA based Source Code Plagiarism Detection Approach using Sparse Principle Component Analysis},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {12 2018},
volume = {6},
number = {12},
month = {12},
year = {2018},
issn = {2347-2693},
pages = {409--417},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=3353},
doi = {10.26438/ijcse/v6i12.409417},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v6i12.409417
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=3353
TI  - A CSA based Source Code Plagiarism Detection Approach using Sparse Principle Component Analysis
T2  - International Journal of Computer Sciences and Engineering
AU  - M. Bhavani
AU  - K. Thammi Reddy
AU  - P. Suresh Varma
PY  - 2018
DA  - 2018/12/31
PB  - IJCSE, Indore, INDIA
SP  - 409
EP  - 417
IS  - 12
VL  - 6
SN  - 2347-2693
ER  -

Abstract

Detecting source code plagiarism is valuable to both academia and industry. Plagiarism, the unlawful appropriation of another person's source code or program, is a serious problem for open source projects and software companies alike. Numerous techniques for automatic source code plagiarism detection have previously been introduced using evolutionary intelligent algorithms such as the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). These techniques are susceptible to premature convergence and are time consuming. In this paper, drawing on the benefits of the artificial immune system, a source code plagiarism detection approach is proposed that overcomes these drawbacks of the earlier GA- and PSO-based algorithms. Sparse PCA is applied to the obtained sparse matrix for dimensionality reduction before detection. The comparison between source codes is then computed using the Clonal Selection Algorithm (CSA), and fitness is evaluated using Normalized Euclidean Distance (NED) and Normalized Cumulative Reciprocal Rank (NCRR). The performance analysis shows that the proposed approach achieves better precision and recall than existing metaheuristic-based source code plagiarism detection algorithms.
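
To make the clonal selection idea in the abstract more concrete, the following is a minimal sketch of a generic CLONALG-style loop (in the spirit of De Castro and Von Zuben), not the authors' implementation. It assumes that each source file has already been reduced to a non-negative numeric feature vector, and it evolves a population of "antibody" vectors toward a query "antigen" vector. The function names, the parameter defaults, and the particular normalization used for NED (distance between unit-length vectors, halved) are all illustrative assumptions. The sparse PCA step and the NCRR measure are not shown.

```python
import math
import random


def ned(a, b):
    """Normalized Euclidean distance (assumed form): distance between the
    unit-length versions of a and b, halved so the result lies in [0, 1]."""
    na = math.sqrt(sum(x * x for x in a)) or 1.0  # avoid division by zero
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return math.sqrt(sum((x / na - y / nb) ** 2 for x, y in zip(a, b))) / 2.0


def affinity(antibody, antigen):
    """Fitness: higher means the antibody is closer to the antigen."""
    return 1.0 - ned(antibody, antigen)


def clonal_selection(antigen, pop_size=20, n_select=5, clone_factor=2.0,
                     n_replace=3, generations=50, seed=0):
    """Generic CLONALG-style search (hypothetical parameters).

    Returns the best antibody and the best affinity seen per generation.
    """
    rng = random.Random(seed)
    dim = len(antigen)

    def new_cell():
        return [rng.random() for _ in range(dim)]

    def fitness(antibody):
        return affinity(antibody, antigen)

    population = [new_cell() for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        clones = []
        for rank, parent in enumerate(population[:n_select], start=1):
            # Better-ranked parents receive more clones.
            n_clones = max(1, round(clone_factor * pop_size / rank))
            # Hypermutation: lower affinity means a larger mutation step.
            sigma = 0.5 * (1.0 - fitness(parent)) + 1e-3
            for _ in range(n_clones):
                clones.append(
                    [max(0.0, x + rng.gauss(0.0, sigma)) for x in parent]
                )
        # Elitist reselection keeps the best cells, then a few random
        # newcomers are injected to maintain diversity.
        pool = sorted(population + clones, key=fitness, reverse=True)
        population = pool[:pop_size - n_replace]
        population += [new_cell() for _ in range(n_replace)]
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history


if __name__ == "__main__":
    # Hypothetical feature vector for one source file.
    query = [4.0, 0.0, 2.0, 1.0, 3.0]
    best, history = clonal_selection(query, generations=30, seed=1)
    print("first-generation affinity:", round(history[0], 4))
    print("final affinity:", round(history[-1], 4))
```

Because the reselection step is elitist, the best affinity in this sketch never decreases from one generation to the next. The random newcomers injected each generation are the usual clonal selection mechanism for maintaining diversity, which relates to the premature convergence the abstract attributes to GA- and PSO-based methods.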

Keywords / Index Terms

Source Code detection, Plagiarism approach, Artificial Immune System, Clonal Selection Algorithm, Sparse PCA
