Abstract
Detecting text in scene images has been an active research area for the last couple of decades. The problem is challenging because of environmental factors such as complex backgrounds, poor resolution, arbitrary text orientation, and the presence of multiple scripts in multi-lingual scenarios. Tesseract is a well-known OCR engine for document-level image analysis. However, to the best of our knowledge, the use of Tesseract for text detection has not been reported yet. Therefore, this paper presents a fair assessment of the performance of Tesseract in text detection. The reported work is evaluated on multiple benchmark datasets, viz. ICDAR 2013 (born-digital images), ICDAR 2013 (focused scene text), and ICDAR 2019-MLT, to validate its performance and efficiency.
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Anwar, N., Khan, T., Mollah, A.F. (2022). Text Detection from Scene and Born Images: How Good is Tesseract?. In: Pundir, A.K.S., Yadav, N., Sharma, H., Das, S. (eds) Recent Trends in Communication and Intelligent Systems. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-1324-2_13
DOI: https://doi.org/10.1007/978-981-19-1324-2_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-1323-5
Online ISBN: 978-981-19-1324-2
eBook Packages: Intelligent Technologies and Robotics (R0)