Abstract
Optical character recognition (OCR) is the process of converting a document image into editable text. The process passes through several phases before the final character recognition step, and segmentation is one of the most important. In the segmentation phase, the document image is split into individual lines, words, and characters. The character recognition rate, or accuracy, depends largely on correct segmentation. An OCR system requires several kinds of segmentation: page, line, word, and character segmentation. This paper proposes a script-independent, projection-based approach to line segmentation in medieval handwritten Devnagari manuscripts. The input document is scanned horizontally, pixel by pixel, a histogram is created for each line, and the lines are segmented according to revised local minima. The revised minima make the technique suitable for Indian scripts that have modifiers (matras) above and below the characters. The manuscripts addressed here are very old and are degraded by age, insects, weather, and other factors. Cleaning such an image and making it noise-free is a major challenge, and a noisy image can produce incorrect segmentation.
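The projection-based idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the revised local minima rule, so the sketch substitutes a plain valley threshold on the horizontal projection profile, and merges text bands separated by only a few blank rows as a rough stand-in for keeping detached modifiers with their line. The function names and the parameters `min_ink` and `min_gap` are hypothetical, and the input is assumed to be an already binarized image in which 1 marks ink and 0 marks background.

```python
from typing import List, Tuple


def horizontal_profile(image: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in every row of a binary image."""
    return [sum(row) for row in image]


def segment_lines(
    image: List[List[int]], min_ink: int = 1, min_gap: int = 1
) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, one per text line.

    Assumption (not from the paper): a row counts as text when its ink
    count is at least min_ink, and two text bands separated by fewer than
    min_gap blank rows are merged into one line.
    """
    profile = horizontal_profile(image)

    # Step 1: collect maximal runs of rows that contain enough ink.
    bands: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y
        elif ink < min_ink and start is not None:
            bands.append((start, y))
            start = None
    if start is not None:
        bands.append((start, len(profile)))

    # Step 2: merge bands whose separating valley is narrower than min_gap.
    merged: List[Tuple[int, int]] = []
    for top, bottom in bands:
        if merged and top - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], bottom)
        else:
            merged.append((top, bottom))
    return merged
```

Calling `segment_lines(image, min_gap=2)` on a binarized page returns one row range per detected line. On degraded manuscripts, noise raises the profile inside the valleys, so `min_ink` would likely need tuning; that sensitivity is the reason the abstract stresses noise removal.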
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mehta, N., Doshi, J. (2021). Text Line Segmentation for Medieval Devnagari Manuscript. In: Purohit, S., Singh Jat, D., Poonia, R., Kumar, S., Hiranwal, S. (eds) Proceedings of International Conference on Communication and Computational Technologies. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-5077-5_37
DOI: https://doi.org/10.1007/978-981-15-5077-5_37
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-5076-8
Online ISBN: 978-981-15-5077-5
eBook Packages: Intelligent Technologies and Robotics (R0)