A Variance Based Image Binarization Scheme and Its Application in Text Segmentation

Ghoshal, Ranjit; Saha, Aditya; Das, Sayan

doi:10.1007/978-3-319-69900-4_17

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10597)

Included in the following conference series:

International Conference on Pattern Recognition and Machine Intelligence

Abstract

This paper presents a novel variance based image binarization scheme for automatic segmentation of text from low resolution images. First, the variance based binarization scheme is applied separately to the three color planes of the image. The binarized planes are then merged to obtain the final binarized image, which yields several connected components (CCs). These CCs are analyzed in order to segment the possible text CCs. To this end, a number of features that discriminate between text and non-text components are considered, and KNN and SVM classifiers are applied to the resulting two-class classification problem. For training the KNN and SVM classifiers, ground-truth information of text CCs and non-text CCs prepared in our laboratory are used. We conduct extensive experiments on the publicly available ICDAR 2011 Born Digital data set and compare against a number of previously reported methods. Our binarization scheme significantly outperforms the existing methods, and the segmentation results are also satisfactory.

1 Introduction

Text in scene images carries important information and is exploited in many content-based video and image applications [1]. Text segmentation is a challenging problem due to variations in font, color, size, orientation, etc. Binarization is also a great challenge, especially when processing text based scene images, where the binarization result can directly influence the OCR rate. Several binarization methods exist for document images, but they cannot be directly applied to low resolution images. Conventional binarization techniques use either global [2] or local [3, 4] thresholding. Existing techniques for scene text segmentation can generally be classified into two groups: sliding window based [5] and CC based [6] schemes. Sliding window based schemes use a sliding window to search for possible text in the scene image and then use machine learning methods to identify text. CC based methods separate character candidates from scene images by CC analysis. Owing to their relatively simple implementation, CC based methods are widely used. Here, we are interested in color images with embedded text. Our methods are presented in the following sections.

2 Variance Based Image Binarization Scheme

Consider a color image which consists of red (R), green (G) and blue (B) planes. Information must be retrieved from each plane. However, the image contains highly varying gray pixel values, which makes binarization a difficult task. This difficulty can be overcome to a great extent by using variance to guide the binarization.

Each plane is passed separately through the binarization process. First, the variance matrix, which captures the change in pixel intensities across the image, is calculated from the gray scale plane. By binarizing the variance matrix, we separate the plane into two regions, one with high variance values and the other with low variance values. From these regions, two gray scale images are formed, one from the white region and the other from the black region of the binarized variance matrix. Binarizing these two images separately produces a more even binarization, as neither contains strongly fluctuating pixel intensities. Each gray scale image is binarized by a window based Otsu binarization method, as illustrated in the algorithm. Finally, the binarized images from the three planes are merged to form the final binarized image. The details are presented in Algorithm 1.

Consider the RGB image (Fig. 1(a)) as input. The image is separated into three planes (Figs. 1(b), (c) and (d)). Consider the R plane, which is passed to the proposed binarization algorithm. Figure 2(a) shows the binarized variance matrix, where the variance is calculated by moving a \(5\times 5\) window over the image. For the white pixels and the black pixels, separate gray scale images are formed and a window based Otsu algorithm is applied to each. The results are presented in Figs. 3(a) and (b). These are merged to obtain the binarized image for the R plane (Fig. 3(e)). Similarly, the binarized images for the G and B planes are presented in Figs. 4(c) and (d). The binarized images from all planes are merged to obtain the final binarized image (Fig. 4(e)).
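The per-plane procedure described in this section can be sketched in plain Python as follows. This is only an illustrative sketch, not Algorithm 1 itself: the function names are hypothetical, the window is clipped at the image border, and a single Otsu threshold per region stands in for the window based Otsu step, whose window size is not stated in the text. The rule for merging the three binarized planes is also not given here, so the sketch stops at one plane.

```python
def local_variance(plane, win=5):
    """Variance of gray values in a win x win window around each pixel."""
    h, w, r = len(plane), len(plane[0]), win // 2
    var = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # The window is clipped at the image border (an assumption).
            patch = [plane[j][i]
                     for j in range(max(0, y - r), min(h, y + r + 1))
                     for i in range(max(0, x - r), min(w, x + r + 1))]
            mean = sum(patch) / len(patch)
            var[y][x] = sum((p - mean) ** 2 for p in patch) / len(patch)
    return var


def otsu_threshold(values):
    """Return the level t that maximises the between-class variance
    when values <= t form one class and values > t the other."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    levels = sorted(counts)
    n, total = len(values), sum(values)
    best_t, best_score = levels[0], -1.0
    n0, sum0 = 0, 0.0
    for t in levels[:-1]:  # the top level cannot be a cut point
        n0 += counts[t]
        sum0 += t * counts[t]
        n1 = n - n0
        # Proportional to the between-class variance (constant factor dropped).
        score = n0 * n1 * (sum0 / n0 - (total - sum0) / n1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize_plane(plane, win=5):
    """Binarize one colour plane (rows of gray values); returns rows of 0/1."""
    h, w = len(plane), len(plane[0])
    var = local_variance(plane, win)
    t_var = otsu_threshold([v for row in var for v in row])
    out = [[0] * w for _ in range(h)]
    for high in (True, False):  # high-variance region, then low-variance region
        region = [(y, x) for y in range(h) for x in range(w)
                  if (var[y][x] > t_var) == high]
        if not region:
            continue
        # Assumption: one Otsu threshold per region instead of a windowed Otsu.
        # A region with a single gray level has no real split and maps to 0.
        t = otsu_threshold([plane[y][x] for y, x in region])
        for y, x in region:
            out[y][x] = 1 if plane[y][x] > t else 0
    return out


# Synthetic plane: a dark 12 x 12 block (20) on a bright background (200).
plane = [[20 if 6 <= y < 18 and 6 <= x < 18 else 200 for x in range(24)]
         for y in range(24)]
binary = binarize_plane(plane)
```

On this synthetic plane both regions contain dark and bright pixels, so each region is split cleanly. On real low resolution images the two regions would generally receive different thresholds, which is the point of separating them by variance.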

3 Shape Based Feature Extraction

Image binarization creates a number of CCs. In order to segment text, we consider the following features of each CC.

AL: Axial ratio of a CC. It is the ratio of the lengths of the two axes, i.e. the longer axis divided by the shorter.
LO: Number of lobes in a CC [7].
A: Aspect ratio of a CC [7].
E: Elongation ratio of a CC [7].
O: Object to background pixels ratio of a CC [7].
AR: Area ratio of a CC. It is the ratio of the area of the CC to the area of the input image.
L: Length ratio of a CC. It is the ratio of max(height of the CC, width of the CC) to max(height of I, width of I), where I is the input image.

Now, we construct the feature vector \(\mathbf{Y} = \{AL, LO, A, E, O, AR, L\}\) for a CC.
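As a small illustration, the three features that are fully defined in this section (AL, AR and L) can be computed from the pixel coordinates of a CC as sketched below. The function name and input format are hypothetical. The sketch assumes that the two axes of a CC are the sides of its bounding box and that the area of a CC is its pixel count; neither is stated explicitly in the text. The features LO, A, E and O follow the definitions in [7] and are not reproduced here.

```python
def shape_features(cc_pixels, image_height, image_width):
    """Compute AL, AR and L for one connected component.

    cc_pixels is a list of (row, col) coordinates of the component.
    """
    rows = [r for r, _ in cc_pixels]
    cols = [c for _, c in cc_pixels]
    height = max(rows) - min(rows) + 1  # bounding-box height
    width = max(cols) - min(cols) + 1   # bounding-box width
    return {
        # Assumption: the axes are the bounding-box sides.
        "AL": max(height, width) / min(height, width),
        # Assumption: the area of the CC is its number of pixels.
        "AR": len(cc_pixels) / (image_height * image_width),
        "L": max(height, width) / max(image_height, image_width),
    }


# A solid 2 x 4 rectangle inside a 10 x 20 image.
cc = [(r, c) for r in range(3, 5) for c in range(6, 10)]
features = shape_features(cc, image_height=10, image_width=20)
```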

4 KNN and SVM Based Text Segmentation

To segment the text components, K-NN and SVM classifiers are applied. The feature vector Y is calculated for text and non-text CCs. The dataset contains 420 training images and 102 test images. The ground truth information of the training images is used to create the feature file for 21700 text components. Next, the input images are binarized with our binarization method, and the components present in the ground truth images are eliminated. This yields 78800 non-text CCs, which are used to prepare the feature file for non-text components. Based on these feature files, the K-NN and SVM classifiers are trained separately. To segment the text components from a test image, the image is binarized using our binarization method and the feature vector Y is obtained for each CC. Each CC is then fed to both the trained K-NN and SVM classifiers to decide whether the component is text or non-text. Thus, two output images, one from each classifier, are obtained. Finally, these two images are merged using a logical OR operation to get the final image consisting only of text components.
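The decision and merging steps can be sketched as follows. This is a minimal sketch under stated assumptions: `knn_predict` is a plain k-NN with Euclidean distance and k = 3 (the value of k and the distance measure are not given in the text), the SVM is not implemented and its per-CC decisions are represented by a placeholder list, and all names and feature values are hypothetical. Because each CC is either kept or discarded as a whole, applying OR to the per-CC labels has the same effect as applying OR to the two output images.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, x, k=3):
    """Label x by majority vote among its k nearest training vectors."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def merge_or(labels_a, labels_b):
    """Keep a CC as text (1) if either classifier labels it as text."""
    return [a | b for a, b in zip(labels_a, labels_b)]


# Toy two-dimensional feature vectors: label 1 = text, 0 = non-text.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [0, 0, 0, 1, 1, 1]

test_ccs = [(5.2, 5.1), (0.2, 0.1), (0.4, 0.3)]
knn_labels = [knn_predict(train_x, train_y, x) for x in test_ccs]

# Placeholder for the decisions of a trained SVM on the same CCs.
svm_labels = [0, 0, 1]

final_labels = merge_or(knn_labels, svm_labels)
```

Merging with OR favours recall: a text CC missed by one classifier is still kept if the other one accepts it, at the cost of also keeping the false positives of both.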

5 Results and Discussion

The experimental results are obtained on the ICDAR 2011 Born Digital Dataset [8]. These images are inherently low-resolution, so automatic segmentation of text from them is an important problem. Our experiments are divided into two parts, corresponding to the two aims of the paper.

5.1 Results of Binarization Scheme

Let us first observe some binarization results visually. A few example results are presented in Table 2. The first column shows the sample input images and the second column shows the corresponding binarized images. Our binarization scheme is evaluated in terms of precision, recall and F-measure (FM) [7]. Its performance is also compared with a few known methods in terms of recall, precision and FM on the ICDAR 2011 Born Digital data set. It can be seen from the results (Table 1) that our binarization method significantly outperforms the compared methods.
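For reference, the three measures can be computed as sketched below. The exact counting unit used in the evaluation follows [7]; this sketch assumes the standard pixel-level definitions, with text pixels marked 1 in both the binarized image and the ground truth. The function name is hypothetical.

```python
def precision_recall_fm(predicted, ground_truth):
    """Pixel-level precision, recall and F-measure for binary images (1 = text)."""
    tp = fp = fn = 0
    for p_row, g_row in zip(predicted, ground_truth):
        for p, g in zip(p_row, g_row):
            if p == 1 and g == 1:
                tp += 1  # text pixel correctly detected
            elif p == 1 and g == 0:
                fp += 1  # background pixel reported as text
            elif p == 0 and g == 1:
                fn += 1  # text pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # FM is the harmonic mean of precision and recall.
    fm = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, fm


predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
p, r, fm = precision_recall_fm(predicted, ground_truth)
```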

Table 1. Recall, Precision and FM for different binarization techniques.

Table 2. Input images, binarized images and segmented text (using KNN, SVM, and the merged KNN and SVM output), presented in the \(1^{st}\), \(2^{nd}\), \(3^{rd}\), \(4^{th}\) and \(5^{th}\) columns, respectively.

5.2 Text Identification Results

We present the text segmentation results obtained by our KNN and SVM classifiers. A few images and the corresponding text segmented using the KNN, SVM and merged KNN and SVM classifiers are presented in Table 2. Visually, it is clear that our approach performs well at text segmentation. For a quantitative comparison, the Recall, Precision and FM values of our different classification methods on the ICDAR 2011 Born Digital data set images are presented in Table 3. The final evaluation of our scheme is made by comparing it with other known techniques. The ICDAR 2011 Robust Reading Competition reported evaluation results for a number of methods from different participants. In Table 3, a few of these techniques are compared with our scheme. Our scheme achieves the highest recall (77.72).

Table 3. Recall, Precision and FM for different text segmentation methods.

6 Summary and Future Scope

This paper presents a new variance based image binarization scheme and its application in text segmentation. A number of shape based features are defined for the segmentation of text. SVM and KNN classifiers are then trained to classify text and non-text components. Finally, the results obtained from SVM and KNN are merged to get the final segmented text. The proposed method is very effective for low resolution images. Future work may aim at incorporating machine learning tools to improve the binarization scheme.

References

1. Yin, X.C., Hao, H.W., Sun, J., Naoi, S.: Robust vanishing point detection for mobile cam-based documents. In: Proceedings of ICDAR, pp. 136–140 (2011)
2. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9(1), 62–66 (1979)
3. Sauvola, J., Pietikäinen, M.: Adaptive document image binarization. Pattern Recogn. 33(2), 225–236 (2000)
4. Niblack, W.: An Introduction to Digital Image Processing. Prentice Hall, Englewood Cliffs (1986)
5. Lee, J.J., Lee, P.H., Lee, S.W., Yuille, A., Koch, C.: Adaboost for text detection in natural scene. In: ICDAR, pp. 429–434 (2011)
6. Yi, C., Tian, Y.: Localizing text in scene images by boundary clustering, stroke segmentation, and string fragment classification. IEEE Trans. Image Process. 21(9), 4256–4268 (2012)
7. Ghoshal, R., Roy, A., Parui, S.K.: A copula based statistical model for text extraction from scene images. In: Maji, P., Ghosh, A., Murty, M.N., Ghosh, K., Pal, S.K. (eds.) PReMI 2013. LNCS, vol. 8251, pp. 489–494. Springer, Heidelberg (2013). doi:10.1007/978-3-642-45062-4_67
8. Karatzas, D., Robles Mestre, S., Mas, J., Nourbakhsh, F., Roy, P.P.: ICDAR 2011 robust reading competition-challenge 1: Reading text in born-digital images (web and email). In: ICDAR, pp. 1485–1490 (2011)
9. Bhattacharya, U., Parui, S.K., Mondal, S.: Devanagari and Bangla text extraction from natural scene images. In: Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 171–175 (2009)
10. Kumar, D., Ramakrishnan, A.G.: OTCYMIST: Otsu-Canny minimal spanning tree for born-digital images. In: DAR, DAS 2012, pp. 389–393 (2012)

Author information

Authors and Affiliations

St. Thomas’ College of Engineering and Technology, Kolkata, India
Ranjit Ghoshal, Aditya Saha & Sayan Das

Corresponding author

Correspondence to Ranjit Ghoshal.

Editor information

Editors and Affiliations

Indian Statistical Institute, Kolkata, India
B. Uma Shankar
Indian Statistical Institute, Kolkata, India
Kuntal Ghosh
Indian Statistical Institute, Kolkata, India
Deba Prasad Mandal
Indian Statistical Institute, Kolkata, India
Shubhra Sankar Ray
The Hong Kong Polytechnic University, Hong Kong, China
David Zhang
Indian Statistical Institute, Kolkata, India
Sankar K. Pal

Cite this paper

Ghoshal, R., Saha, A., Das, S. (2017). A Variance Based Image Binarization Scheme and Its Application in Text Segmentation. In: Shankar, B., Ghosh, K., Mandal, D., Ray, S., Zhang, D., Pal, S. (eds) Pattern Recognition and Machine Intelligence. PReMI 2017. Lecture Notes in Computer Science, vol 10597. Springer, Cham. https://doi.org/10.1007/978-3-319-69900-4_17

DOI: https://doi.org/10.1007/978-3-319-69900-4_17
Published: 01 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69899-1
Online ISBN: 978-3-319-69900-4
eBook Packages: Computer Science, Computer Science (R0)
