Open Access Article

System Implementation and Testing of Proposed Language Independent Stemmer

M. Kasthuri1

Section:Research Paper, Product Type: Journal Paper
Volume-6 , Issue-11 , Page no. 265-273, Nov-2018

CrossRef-DOI:   https://doi.org/10.26438/ijcse/v6i11.265273

Online published on Nov 30, 2018

Copyright © M. Kasthuri . This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: M. Kasthuri, “System Implementation and Testing of Proposed Language Independent Stemmer,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.11, pp.265-273, 2018.

MLA Style Citation: M. Kasthuri. "System Implementation and Testing of Proposed Language Independent Stemmer." International Journal of Computer Sciences and Engineering 6.11 (2018): 265-273.

APA Style Citation: Kasthuri, M. (2018). System Implementation and Testing of Proposed Language Independent Stemmer. International Journal of Computer Sciences and Engineering, 6(11), 265-273.

BibTex Style Citation:
@article{Kasthuri_2018,
author = {M. Kasthuri},
title = {System Implementation and Testing of Proposed Language Independent Stemmer},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {11 2018},
volume = {6},
Issue = {11},
month = {11},
year = {2018},
issn = {2347-2693},
pages = {265-273},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=3155},
doi = {https://doi.org/10.26438/ijcse/v6i11.265273},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
DO - https://doi.org/10.26438/ijcse/v6i11.265273
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=3155
TI - System Implementation and Testing of Proposed Language Independent Stemmer
T2 - International Journal of Computer Sciences and Engineering
AU - M. Kasthuri
PY - 2018
DA - 2018/11/30
PB - IJCSE, Indore, INDIA
SP - 265
EP - 273
IS - 11
VL - 6
SN - 2347-2693
ER -

Abstract

Information Retrieval (IR) is an emerging discipline concerned with methods, models and patterns for finding unstructured documents in a dynamic environment. An Information Retrieval System (IRS) retrieves the information a user requires from billions of documents stored on millions of computers. Search engines play a major role in IRS, and they use stemming to identify the morphological variants of words in a language. Stemming is an important pre-processing step in query-based systems such as IRS, web search engines, Natural Language Processing (NLP) and Big Data analysis. Its purpose is to reduce the different grammatical or inflected forms of a word to a common base form. Most web pages today are written in English and other European languages, but the number of pages written in Indian and other Asian languages is also increasing. Search engines therefore have to handle morphologically different languages within a fraction of a second. The study finds that existing approaches to developing a stemmer are rule-based, machine-learning-based or hybrid, and that each has its own limitations. A model for a Language Independent Stemmer based on Dynamic Programming (DP) has therefore been proposed to retrieve multilingual web documents with greater speed and accuracy. This paper presents the system implementation and testing of the Proposed Language Independent Stemmer (PLIS). The performance of PLIS has been analyzed using a test bed.
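
The abstract does not spell out the dynamic-programming formulation used in PLIS. Purely as an illustration of how DP can recover a shared base form without language-specific rules, the sketch below uses the classic longest-common-substring table. The function names `longest_common_substring` and `candidate_stem` are hypothetical, and this is an assumed stand-in technique, not the author's algorithm.

```python
from typing import Sequence


def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous substring shared by a and b.

    Classic DP: cell (i, j) holds the length of the longest common
    suffix of a[:i] and b[:j]. Only two rows are kept in memory.
    On ties, the earliest match in `a` is returned.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common suffix ending at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def candidate_stem(words: Sequence[str]) -> str:
    """Fold a group of word variants into one shared substring.

    Assumption for this sketch: the caller has already grouped
    variants of the same word; no grouping is attempted here.
    """
    if not words:
        return ""
    stem = words[0]
    for word in words[1:]:
        stem = longest_common_substring(stem, word)
    return stem


if __name__ == "__main__":
    print(candidate_stem(["connect", "connected", "connecting", "connection"]))
    print(candidate_stem(["playing", "played", "plays"]))
```

Because the comparison works on characters rather than on a suffix list, the same code runs unchanged on any script Python can represent as a string. A practical stemmer would still need a way to group candidate variants and to reject accidental overlaps between unrelated words.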

Keywords / Index Terms

Information Retrieval, Stemming, EMILLE, Language Independent Stemmer, Dynamic Programming
