Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text
Suparna Arya, Amit Majumder, Sulabh Majumder, Aparajita Kundu, Nuzhat Shamim, Ira Nath
Section: Research Paper, Product Type: Journal Paper
Volume-08, Issue-01, Page no. 1-4, Feb-2020
Online published on Feb 28, 2020
Copyright © Suparna Arya, Amit Majumder, Sulabh Majumder, Aparajita Kundu, Nuzhat Shamim, Ira Nath. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: Suparna Arya, Amit Majumder, Sulabh Majumder, Aparajita Kundu, Nuzhat Shamim, Ira Nath, “Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text,” International Journal of Computer Sciences and Engineering, Vol.08, Issue.01, pp.1-4, 2020.
MLA Style Citation: Suparna Arya, Amit Majumder, Sulabh Majumder, Aparajita Kundu, Nuzhat Shamim, Ira Nath. "Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text." International Journal of Computer Sciences and Engineering 08.01 (2020): 1-4.
APA Style Citation: Suparna Arya, Amit Majumder, Sulabh Majumder, Aparajita Kundu, Nuzhat Shamim, Ira Nath (2020). Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text. International Journal of Computer Sciences and Engineering, 08(01), 1-4.
BibTex Style Citation:
@article{Arya_2020,
author = {Suparna Arya and Amit Majumder and Sulabh Majumder and Aparajita Kundu and Nuzhat Shamim and Ira Nath},
title = {Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {2 2020},
volume = {08},
number = {01},
month = {2},
year = {2020},
issn = {2347-2693},
pages = {1--4},
url = {https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=1389},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=1389
TI - Supervised Machine Learning approach for Extracting Named Entities from Hindi-English Mixed Social Media Text
T2 - International Journal of Computer Sciences and Engineering
AU - Suparna Arya
AU - Amit Majumder
AU - Sulabh Majumder
AU - Aparajita Kundu
AU - Nuzhat Shamim
AU - Ira Nath
PY - 2020
DA - 2020/02/28
PB - IJCSE, Indore, INDIA
SP - 1
EP - 4
IS - 01
VL - 08
SN - 2347-2693
ER -
Abstract
Named Entity Recognition (NER) is the task of identifying named entities in natural language text. It takes a string of text, such as a sentence or paragraph, as input and identifies the relevant nouns mentioned in it, such as names of people, places, and organizations. NER is an Information Extraction task within the field of Natural Language Processing (NLP). A significant amount of work has been carried out on named entity recognition, but most of this research has targeted resource-rich languages and domains. The task is more challenging for informal and code-mixed text, whose unstructured and incomplete information complicates the process. In this paper, we propose a method for extracting named entities from code-mixed data with several machine learning algorithms, using content and contextual features extracted from that data.
Key-Words / Index Terms
Named Entity, Machine Learning, Support Vector Machine, Decision Tree, K-Nearest Neighbour
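To make the idea of content and contextual features concrete, the following is a minimal sketch in Python, not the implementation used in the paper. The feature set, the tag names (PER, LOC, O), the function names, and the four training sentences are all invented for illustration; the abstract does not specify them. The classifier is a small hand-written K-Nearest Neighbour vote over mismatching features, chosen only because K-Nearest Neighbour appears among the keywords.

```python
from collections import Counter


def token_features(tokens, i):
    """Content and contextual features for tokens[i] (hypothetical feature set)."""
    word = tokens[i]
    return {
        # Content features: properties of the token itself.
        "lower": word.lower(),
        "is_capitalized": word[:1].isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        "starts_with_symbol": word.startswith(("@", "#")),
        "prefix2": word[:2].lower(),
        "suffix2": word[-2:].lower(),
        # Contextual features: the neighbouring tokens, with boundary markers.
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>",
    }


def distance(a, b):
    """Number of features on which two feature dicts disagree."""
    return sum(a[key] != b[key] for key in a)


def knn_predict(train, features, k=3):
    """Majority label among the k training examples closest to `features`."""
    nearest = sorted(train, key=lambda example: distance(example[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training sentences invented for this sketch (not the paper's dataset).
TRAIN_SENTENCES = [
    (["Kal", "Delhi", "mein", "rain", "hogi"], ["O", "LOC", "O", "O", "O"]),
    (["Rahul", "ne", "match", "dekha"], ["PER", "O", "O", "O"]),
    (["Main", "Mumbai", "mein", "hoon"], ["O", "LOC", "O", "O"]),
    (["Priya", "ne", "movie", "dekhi"], ["PER", "O", "O", "O"]),
]


def build_training_set(sentences):
    """Flatten tagged sentences into (features, label) pairs, one per token."""
    examples = []
    for tokens, labels in sentences:
        for i, label in enumerate(labels):
            examples.append((token_features(tokens, i), label))
    return examples


def tag(tokens, train, k=3):
    """Predict one label per token using the k-nearest-neighbour vote."""
    return [knn_predict(train, token_features(tokens, i), k)
            for i in range(len(tokens))]


if __name__ == "__main__":
    train = build_training_set(TRAIN_SENTENCES)
    print(tag(["Rohit", "ne", "cricket", "dekha"], train))
    # prints ['PER', 'O', 'O', 'O']
```

In this toy run the unseen token "Rohit" is tagged PER because it shares capitalization, sentence-initial position, and the following word "ne" with the training names. A real system would need a far larger annotated corpus and a tuned feature set, and could swap the hand-written vote for any of the other classifiers listed in the keywords.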