A Comparative Study of Segmentation Techniques used in Handwritten Documents
Ms. S. A. Bhopi¹, Mr. M. P. Singh²
- ¹ Department of Computer Science, MGM’s College of CS & IT, SRTMU, Nanded, Maharashtra (India).
- ² Department of Computer Science, IET, Dr. B. R. Ambedkar University, Agra, U.P. (India).
Section: Research Paper, Product Type: Journal Paper
Volume-6, Issue-4, Page no. 200-205, Apr-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i4.200205
Online published on Apr 30, 2018
Copyright © Ms. S. A. Bhopi, Mr. M. P. Singh. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: Ms. S. A. Bhopi, Mr. M. P. Singh, “A Comparative Study of Segmentation Techniques used in Handwritten Documents,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.4, pp.200-205, 2018.
MLA Style Citation: Ms. S. A. Bhopi, Mr. M. P. Singh "A Comparative Study of Segmentation Techniques used in Handwritten Documents." International Journal of Computer Sciences and Engineering 6.4 (2018): 200-205.
APA Style Citation: Ms. S. A. Bhopi, Mr. M. P. Singh, (2018). A Comparative Study of Segmentation Techniques used in Handwritten Documents. International Journal of Computer Sciences and Engineering, 6(4), 200-205.
BibTex Style Citation:
@article{Bhopi_2018,
author = {Bhopi, S. A. and Singh, M. P.},
title = {A Comparative Study of Segmentation Techniques used in Handwritten Documents},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {4 2018},
volume = {6},
number = {4},
month = {4},
year = {2018},
issn = {2347-2693},
pages = {200-205},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=1869},
doi = {10.26438/ijcse/v6i4.200205},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v6i4.200205
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=1869
TI  - A Comparative Study of Segmentation Techniques used in Handwritten Documents
T2  - International Journal of Computer Sciences and Engineering
AU  - Bhopi, S. A.
AU  - Singh, M. P.
PY  - 2018
DA  - 2018/04/30
PB  - IJCSE, Indore, INDIA
SP  - 200
EP  - 205
IS  - 4
VL  - 6
SN  - 2347-2693
ER  -
Abstract
Handwritten document image segmentation is a key step in an OCR (Optical Character Recognition) system, because inaccurately segmented text lines cause errors in the recognition stage. The choice of segmentation algorithm is therefore an essential factor in the accuracy of the OCR system. Devanagari is the most popular script in India; it is used to write Sanskrit, Hindi, Marathi, Kashmiri, Sindhi, Bihari, Bhili, Konkani, Bhojpuri and Nepali. It has vowels, consonants, vowel modifiers, compound characters and numerals. Optical character recognition for Devanagari is highly complex because of its rich set of conjuncts, and the nature of handwriting makes text line segmentation very challenging. Several techniques for segmenting handwritten text lines have been proposed in the past. Our purpose is to provide a learning-based approach for segmentation of handwritten document images. This paper presents a quantitative comparison of three algorithms for page segmentation: Projection Profile, Run-length Smearing and Bounding Box, along with morphological operations such as erosion and dilation. We implemented these algorithms on our own dataset of handwritten documents and compared the accuracy and results of the methods.
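As an illustration of the projection profile idea named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the page is already binarized into a list of rows where 1 marks ink and 0 marks background; the function names `horizontal_projection` and `segment_lines` and the `min_pixels` parameter are hypothetical and chosen only for this example.

```python
from typing import List, Tuple


def horizontal_projection(image: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in every row of a binary image."""
    return [sum(row) for row in image]


def segment_lines(image: List[List[int]], min_pixels: int = 1) -> List[Tuple[int, int]]:
    """Return (start_row, end_row) pairs, end exclusive, one per text line.

    A text line is taken to be a maximal run of consecutive rows whose
    projection value is at least `min_pixels` (an assumed threshold).
    """
    profile = horizontal_projection(image)
    lines = []
    start = None
    for y, count in enumerate(profile):
        if count >= min_pixels and start is None:
            start = y                      # first inked row of a new line
        elif count < min_pixels and start is not None:
            lines.append((start, y))       # a blank row closes the line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy page: two "text lines" separated by blank rows.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(segment_lines(page))  # [(1, 3), (5, 6)]
```

A sketch like this only separates lines cleanly when blank rows exist between them; skewed or touching handwritten lines would likely need preprocessing or a different method.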
Keywords / Index Terms
OCR, Line and Word Segmentation, Projection Profile, Bounding Box, Run-length Smearing
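Run-length smearing and bounding boxes, two of the techniques listed above, can be combined for word segmentation. The sketch below is one possible reading under stated assumptions, not the method evaluated in the paper: the input is a binarized image of a single text line (1 = ink), smearing is horizontal only, and the names `smear_row`, `rlsa_horizontal`, `word_boxes` and the `threshold` parameter are hypothetical.

```python
from typing import List, Tuple


def smear_row(row: List[int], threshold: int) -> List[int]:
    """Fill background gaps of at most `threshold` pixels between two ink pixels."""
    out = list(row)
    last_ink = None
    for x, pixel in enumerate(row):
        if pixel == 1:
            if last_ink is not None and 0 < x - last_ink - 1 <= threshold:
                for k in range(last_ink + 1, x):
                    out[k] = 1             # short gap: merge into one run
            last_ink = x
    return out


def rlsa_horizontal(image: List[List[int]], threshold: int) -> List[List[int]]:
    """Apply horizontal run-length smearing to every row."""
    return [smear_row(row, threshold) for row in image]


def word_boxes(image: List[List[int]], threshold: int) -> List[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes, inclusive, one per smeared block."""
    smeared = rlsa_horizontal(image, threshold)
    width = len(smeared[0])
    column_has_ink = [any(row[x] for row in smeared) for x in range(width)]
    boxes = []
    start = None
    for x in range(width + 1):
        ink = x < width and column_has_ink[x]
        if ink and start is None:
            start = x                      # left edge of a new block
        elif not ink and start is not None:
            # Vertical extent comes from the original (unsmeared) pixels.
            rows = [y for y, row in enumerate(image) if any(row[start:x])]
            boxes.append((start, rows[0], x - 1, rows[-1]))
            start = None
    return boxes


# Toy text line: a 1-pixel gap inside the first word, a 4-pixel gap between words.
line = [
    [1, 0, 1, 0, 0, 0, 0, 1, 1],
    [1, 1, 1, 0, 0, 0, 0, 1, 0],
]
print(smear_row([1, 0, 0, 1, 0, 0, 0, 0, 1], 2))  # [1, 1, 1, 1, 0, 0, 0, 0, 1]
print(word_boxes(line, threshold=1))              # [(0, 0, 2, 1), (7, 0, 8, 1)]
```

The threshold decides which gaps count as inside a word and which separate words, so in practice it would need tuning to the writer's spacing and the scan resolution.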