
Use of Semantic Search to Enhance the Performance of Plagiarism Detection Tools

A. Das, S. Shaw

Section: Research Paper, Product Type: Journal Paper
Volume 07, Issue 01, Pages 165-169, Jan 2019

Online published on Jan 20, 2019

Copyright © A. Das, S. Shaw. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper

IEEE Style Citation: A. Das, S. Shaw, “Use of Semantic Search to Enhance the Performance of Plagiarism Detection Tools,” International Journal of Computer Sciences and Engineering, Vol.07, Issue.01, pp.165-169, 2019.

MLA Style Citation: A. Das, S. Shaw. "Use of Semantic Search to Enhance the Performance of Plagiarism Detection Tools." International Journal of Computer Sciences and Engineering 07.01 (2019): 165-169.

APA Style Citation: A. Das, S. Shaw (2019). Use of Semantic Search to Enhance the Performance of Plagiarism Detection Tools. International Journal of Computer Sciences and Engineering, 07(01), 165-169.

BibTex Style Citation:
@article{Das_2019,
author = {A. Das and S. Shaw},
title = {Use of Semantic Search to Enhance the Performance of Plagiarism Detection Tools},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {1 2019},
volume = {07},
number = {01},
month = {1},
year = {2019},
issn = {2347-2693},
pages = {165-169},
url = {https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=613},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
UR  - https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=613
TI  - Use of Semantic Search to Enhance the Performance of Plagiarism Detection Tools
T2  - International Journal of Computer Sciences and Engineering
AU  - A. Das
AU  - S. Shaw
PY  - 2019
DA  - 2019/01/20
PB  - IJCSE, Indore, INDIA
SP  - 165
EP  - 169
IS  - 01
VL  - 07
SN  - 2347-2693
ER  -

           

Abstract

Plagiarism is a breach of copyright in the academic world. It has become a serious issue since the explosion of digital information: copying has become easier because of the huge number of available sources, while detection has become more difficult. Plagiarism can be of two types: source code plagiarism, which refers to copying code from proprietary software, and text plagiarism, which refers to copying another author's text and presenting it as one's own. Several tools exist to detect both types. In this paper we concentrate mainly on text plagiarism: we discuss the algorithms used in available software such as Turnitin, iThenticate, and SafeAssign, and how NLP techniques and parallel processing can improve them. Most such software determines a similarity score for each pair of documents and uses the SCAM (Stanford Copy Analysis Mechanism) algorithm to calculate a relative measure of overlap when comparing the set of words the documents have in common. We try to establish how semantic similarity can increase true positive and true negative detections and reduce false positive and false negative detections.
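
As an illustration of the idea in the abstract, the following minimal Python sketch computes a word-overlap similarity score for a pair of documents, with and without a semantic normalization step. It is not the SCAM formula and not the method of any named tool: the containment measure is a deliberately simplified stand-in for a relative overlap measure, and the `SYNONYMS` table and all function names are hypothetical, chosen only for this example. A real system would presumably consult a lexical resource such as WordNet rather than a hand-written table.

```python
import re
from collections import Counter

# Hypothetical toy synonym table (an assumption for illustration only);
# a real system might map words to WordNet synsets instead.
SYNONYMS = {"automobile": "car", "vehicle": "car", "quick": "fast", "rapid": "fast"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def normalize(tokens, synonyms=None):
    """Map each token to a canonical form if a synonym table is given."""
    if synonyms is None:
        return list(tokens)
    return [synonyms.get(t, t) for t in tokens]


def containment(a_tokens, b_tokens):
    """Fraction of the tokens of a that also occur in b (multiset overlap)."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    if not a:
        return 0.0
    shared = sum(min(count, b[word]) for word, count in a.items())
    return shared / sum(a.values())


def similarity(text_a, text_b, synonyms=None):
    """Symmetric score: the larger of the two directional containments."""
    a = normalize(tokenize(text_a), synonyms)
    b = normalize(tokenize(text_b), synonyms)
    return max(containment(a, b), containment(b, a))


source = "The quick car stopped"
suspect = "The rapid automobile stopped"

lexical_score = similarity(source, suspect)             # exact word matching only
semantic_score = similarity(source, suspect, SYNONYMS)  # synonyms collapsed first
print(lexical_score, semantic_score)
```

In this toy pair, exact word matching sees only half of the words as shared, whereas collapsing synonyms first makes the paraphrase fully overlap. This is the kind of case where a purely lexical score could produce a false negative.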

Keywords / Index Terms

Plagiarism, Semantic Similarity, Semantic Search, Turnitin, WordNet, Ontologies, Natural Language Processing
