Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text

Suparna Arya, Amit Majumder, Sulabh Majumder, Aparajita Kundu, Nuzhat Shamim, Ira Nath

Section: Research Paper, Product Type: Journal Paper
Volume-08, Issue-01, Page no. 1-4, Feb-2020

Online published on Feb 28, 2020

Copyright © Suparna Arya, Amit Majumder, Sulabh Majumder, Aparajita Kundu, Nuzhat Shamim, Ira Nath. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Citation

IEEE Style Citation: Suparna Arya, Amit Majumder, Sulabh Majumder, Aparajita Kundu, Nuzhat Shamim, Ira Nath, “Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text,” International Journal of Computer Sciences and Engineering, Vol.08, Issue.01, pp.1-4, 2020.

MLA Citation

MLA Style Citation: Suparna Arya, Amit Majumder, Sulabh Majumder, Aparajita Kundu, Nuzhat Shamim, Ira Nath. "Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text." International Journal of Computer Sciences and Engineering 08.01 (2020): 1-4.

APA Citation

APA Style Citation: Suparna Arya, Amit Majumder, Sulabh Majumder, Aparajita Kundu, Nuzhat Shamim, Ira Nath (2020). Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text. International Journal of Computer Sciences and Engineering, 08(01), 1-4.

BibTex Citation

BibTex Style Citation:
@article{Arya_2020,
author = {Suparna Arya and Amit Majumder and Sulabh Majumder and Aparajita Kundu and Nuzhat Shamim and Ira Nath},
title = {Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {2 2020},
volume = {08},
number = {01},
month = {2},
year = {2020},
issn = {2347-2693},
pages = {1--4},
url = {https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=1389},
publisher = {IJCSE, Indore, INDIA},
}

RIS Citation

RIS Style Citation:
TY  - JOUR
UR  - https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=1389
TI  - Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text
T2  - International Journal of Computer Sciences and Engineering
AU  - Arya, Suparna
AU  - Majumder, Amit
AU  - Majumder, Sulabh
AU  - Kundu, Aparajita
AU  - Shamim, Nuzhat
AU  - Nath, Ira
PY  - 2020
DA  - 2020/02/28
PB  - IJCSE, Indore, INDIA
SP  - 1
EP  - 4
IS  - 01
VL  - 08
SN  - 2347-2693
ER  - 

Abstract

Named Entity Recognition (NER) is the task of identifying named entities in natural-language text. The input is a string of text, such as a sentence or paragraph, and the output is the set of relevant nouns mentioned in it, such as names of people, places, and organizations. NER is an Information Extraction task within the field of Natural Language Processing (NLP). A significant amount of work has been carried out on named entity recognition, but most of this research targets resource-rich languages and domains. The task is more challenging for informal and code-mixed text, where unstructured and incomplete information complicates the process. In this paper, we propose a method for extracting named entities from code-mixed data with several machine-learning algorithms, using content and contextual features extracted from that data.

Keywords / Index Terms

Named Entity, Machine Learning, Support Vector Machine, Decision Tree, K-Nearest Neighbour
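
The abstract describes token-level classification driven by content features (properties of the token itself) and contextual features (properties of its neighbours). The sketch below illustrates that general idea only; it is not the authors' implementation. The feature set, the tag names (PER, LOC, O), the toy romanized Hindi-English sentences, and the helper names token_features, overlap, and knn_predict are all assumptions made for illustration. A small nearest-neighbour vote stands in for the classifiers named in the keywords.

```python
from collections import Counter


def token_features(tokens, i):
    """Content features of token i plus contextual features from its neighbours."""
    tok = tokens[i]
    return {
        # Content features: properties of the token itself (assumed set).
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "is_mention": tok.startswith("@"),
        "is_hashtag": tok.startswith("#"),
        "has_digit": any(ch.isdigit() for ch in tok),
        "suffix3": tok[-3:].lower(),
        # Contextual features: the previous and next tokens.
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }


def overlap(a, b):
    """Count the feature names on which two feature dicts agree."""
    return sum(1 for key in a if a[key] == b.get(key))


def knn_predict(train, query, k=3):
    """Return the majority label among the k most similar training examples."""
    ranked = sorted(train, key=lambda ex: overlap(ex[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: two toy code-mixed sentences with assumed tags.
sentences = [
    (["Kal", "Delhi", "mein", "match", "hai"], ["O", "LOC", "O", "O", "O"]),
    (["Rahul", "Kolkata", "mein", "rehta", "hai"], ["PER", "LOC", "O", "O", "O"]),
]
train = [
    (token_features(tokens, i), label)
    for tokens, labels in sentences
    for i, label in enumerate(labels)
]

# Classify the token "Mumbai" in an unseen sentence.
test_tokens = ["Aaj", "Mumbai", "mein", "rain", "hai"]
query = token_features(test_tokens, 1)
prediction = knn_predict(train, query)
print(prediction)
```

In this toy setting the unseen token is tagged by the examples that share its capitalization and its following word, which is the kind of signal contextual features are meant to capture. A real system would need a labelled corpus and a proper feature encoding for the chosen classifier.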


ISSN :	2347-2693 (Online)