Automatic Extractive Text Summarization Using K-Means Clustering
M R Prathima, H R Divakar
Section: Research Paper, Product Type: Journal Paper
Volume-6, Issue-6, Page no. 782-787, Jun-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i6.782787
Online published on Jun 30, 2018
Copyright © M R Prathima, H R Divakar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: M R Prathima, H R Divakar, “Automatic Extractive Text Summarization Using K-Means Clustering,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.6, pp.782-787, 2018.
MLA Style Citation: M R Prathima, H R Divakar "Automatic Extractive Text Summarization Using K-Means Clustering." International Journal of Computer Sciences and Engineering 6.6 (2018): 782-787.
APA Style Citation: M R Prathima, H R Divakar, (2018). Automatic Extractive Text Summarization Using K-Means Clustering. International Journal of Computer Sciences and Engineering, 6(6), 782-787.
BibTex Style Citation:
@article{Prathima_2018,
author = {M R Prathima and H R Divakar},
title = {Automatic Extractive Text Summarization Using K-Means Clustering},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {6 2018},
volume = {6},
Issue = {6},
month = {6},
year = {2018},
issn = {2347-2693},
pages = {782-787},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2255},
doi = {https://doi.org/10.26438/ijcse/v6i6.782787},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY - JOUR
DO - https://doi.org/10.26438/ijcse/v6i6.782787
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=2255
TI - Automatic Extractive Text Summarization Using K-Means Clustering
T2 - International Journal of Computer Sciences and Engineering
AU - M R Prathima, H R Divakar
PY - 2018
DA - 2018/06/30
PB - IJCSE, Indore, INDIA
SP - 782-787
IS - 6
VL - 6
SN - 2347-2693
ER -
Abstract
In recent years, data has been growing rapidly in every domain, such as social media, news, and education. Because of this excess of data, there is a need for an automatic text summarizer capable of condensing it. With research increasingly focused on Natural Language Processing (NLP), text summarization can be applied in several fields. Text summarization is the process of extracting information from a document and generating a summarized text of that document, thereby presenting the important information to users in a more concise form. This paper surveys various extractive text summarization methods and proposes a summarization method based on the Support Vector Machine (SVM). The proposed model tries to improve the quality and performance of the summary generated by the clustering technique by cascading it with an SVM. The documents are preprocessed through tokenization, stop-word removal, case folding, and stemming to obtain tokens. Various similarity measures are used to determine the similarity between the sentences of the document, and the sentences are then grouped into clusters on the basis of the term frequency-inverse document frequency (tf-idf) values of their words.
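The pipeline outlined in the abstract (preprocessing, tf-idf weighting, clustering of sentences) can be illustrated with a minimal sketch. This is not the authors' implementation. The stop-word list, the crude suffix-stripping "stemmer", the smoothed idf formula, the deterministic centroid initialization, the use of squared Euclidean distance as the only similarity measure, and all function names are assumptions made for illustration. The SVM stage is omitted because the abstract gives no details about it.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the paper's list is not given).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "and", "to", "it", "for", "at"}

def preprocess(sentence):
    """Case folding, tokenization, stop-word removal, and crude stemming."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Naive stand-in for a real stemmer: drop a trailing "s" from longer words.
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]

def tfidf_vectors(token_lists):
    """Treat each sentence as a document and build dense tf-idf vectors."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        length = max(len(tokens), 1)
        # Smoothed idf (an assumed variant) keeps every weight positive.
        vectors.append([(tf[t] / length) * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t in vocab])
    return vectors

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(vectors, k, iterations=20):
    """Plain k-means, initialized with the first k vectors for repeatability."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def summarize(text, k=2):
    """Return, in document order, the sentence nearest each cluster centroid."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    k = min(k, len(sentences))
    vectors = tfidf_vectors([preprocess(s) for s in sentences])
    labels, centroids = kmeans(vectors, k)
    chosen = set()
    for c in range(k):
        members = [i for i, label in enumerate(labels) if label == c]
        if members:
            chosen.add(min(members, key=lambda i: sq_dist(vectors[i], centroids[c])))
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(text, k=3)` on a document returns at most three sentences, one per non-empty cluster. A production system would more likely use an established stemmer and a library implementation of k-means with randomized or k-means++ initialization.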
Keywords / Index Terms
Text Summarization, Extractive Summarization, Natural Language Processing (NLP), Clustering, Support-Vector-Machine (SVM), Advanced Encryption Standard (AES), Tokens.