Open Access Article

Rule based Stemmer for Marathi Language

N. Pise¹, V. Gupta²

  1. Computer Science Department, IES, IPS Academy, Indore, Madhya Pradesh, India.
  2. Computer Science Department, Banasthali University, Rajasthan, India.

Section: Research Paper, Product Type: Journal Paper
Volume-6, Issue-5, Page no. 500-505, May-2018

CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i5.500505

Online published on May 31, 2018

Copyright © N. Pise, V. Gupta. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: N. Pise, V. Gupta, “Rule based Stemmer for Marathi Language,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.5, pp.500-505, 2018.

MLA Style Citation: N. Pise, V. Gupta "Rule based Stemmer for Marathi Language." International Journal of Computer Sciences and Engineering 6.5 (2018): 500-505.

APA Style Citation: N. Pise, V. Gupta, (2018). Rule based Stemmer for Marathi Language. International Journal of Computer Sciences and Engineering, 6(5), 500-505.

BibTex Style Citation:
@article{Pise_2018,
author = {N. Pise and V. Gupta},
title = {Rule based Stemmer for Marathi Language},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {5 2018},
volume = {6},
number = {5},
month = {5},
year = {2018},
issn = {2347-2693},
pages = {500--505},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2011},
doi = {10.26438/ijcse/v6i5.500505},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v6i5.500505
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=2011
TI  - Rule based Stemmer for Marathi Language
T2  - International Journal of Computer Sciences and Engineering
AU  - Pise, N.
AU  - Gupta, V.
PY  - 2018
DA  - 2018/05/31
PB  - IJCSE, Indore, INDIA
SP  - 500
EP  - 505
IS  - 5
VL  - 6
SN  - 2347-2693
ER  -

Abstract

Natural Language Processing (NLP) is a branch of artificial intelligence that deals with the analysis and synthesis of natural languages in the form of text and speech. NLP relies on stemming algorithms to remove derivational and inflectional affixes without performing a morphological analysis of the input. These algorithms are essential for extracting root or stem words: the goal of stemming is to reduce inflected or derived word forms to their root forms. Accomplishing this requires language-specific knowledge. In NLP, a stemmer can be used to improve the efficiency of text summarization, text mining, information retrieval and sentiment analysis. In this paper, we propose a rule-based stemming approach for the Marathi language that uses a Marathi corpus, a stopword list and suffix-stripping rules.
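The general idea of combining a stopword list with suffix-stripping rules can be sketched in Python as follows. This is only a minimal illustration, not the authors' implementation: the stopwords, the suffixes, the longest-match strategy, the `min_stem_len` guard and the function names `stem` and `stem_text` are all assumptions chosen for the example, since the abstract does not list the actual rules.

```python
# Hypothetical, tiny resources for illustration only; the paper's actual
# stopword list and suffix rules are not given in the abstract.
STOPWORDS = {"आणि", "आहे", "व", "हे"}
SUFFIXES = ["ांना", "ांचा", "ांची", "ातून", "ाने", "ात", "ला", "चा", "ची", "चे", "ने"]

# Try longer suffixes first so that "ाने" is preferred over "ने".
_ORDERED_SUFFIXES = sorted(SUFFIXES, key=len, reverse=True)


def stem(word, min_stem_len=2):
    """Strip the longest matching suffix, keeping at least min_stem_len code points."""
    for suffix in _ORDERED_SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word  # no rule applied


def stem_text(text):
    """Split on whitespace, drop stopwords, and stem the remaining tokens."""
    return [stem(token) for token in text.split() if token not in STOPWORDS]


if __name__ == "__main__":
    print(stem_text("राम आणि मुलांना घरात"))
```

With these toy lists, the stopword `आणि` is discarded, while `मुलांना` and `घरात` lose the suffixes `ांना` and `ात`. A real rule set would need many more suffixes and exception handling, and would have to be derived from a corpus as the paper describes.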

Keywords / Index Terms

natural language processing, stemming, corpus, Marathi, suffix stripping, stopwords
