Keywords

1 Introduction

Binarization in document analysis field is considered as an open challenge, since the historical manuscript images suffer from different kinds of noise and any proposed systems, such as optical character recognition (OCR) and word spotting (WS), need the proper binarized image, while the accuracy of these systems affected directly with this process (binarization). Binarization is the process of extracting the text without any noise (black) and background (white) [1]. With more kinds of noise as; smudge, multi-colored, bleed-through, unclear background, shadow, broken character. The current systems depend on the binarization as the first step which is affected by noise. These systems are included in many applications such as handwritten recognition, watermarking, and data hiding [2].

Thresholding approaches can be either local or global. In the case of degraded images, global approaches do not perform well [3]. Otsu [4], Kapur et al. [5], and Kittler and Illingworth [6] are considered global methods, while Niblack [7], Sauvola and Pietikäinen [8], and Bernsen [9] are considered local methods. From the literature, many different approaches are presented to binarize the degraded images. However, the binarization process is still an open challenge [10].

Recently, meta-heuristic optimization algorithms have a wide range of applications such as feature selection, image processing, and others. Nature-inspired algorithms are well-known optimization algorithms. In these algorithms, the local optima problem can be solved by sharing the information between candidates [11]. Therefore, in this paper, a new cluster algorithm is proposed using one of the recent optimization algorithms named multi-verse optimizer (MVO) [11]. This algorithm is proposed to address the binarization process of historical documents.

The rest of this paper is organized as follows: Section 2 introduces the basics of MVO algorithm. Section 3 presents the proposed approach. In Sect. 4, the experimental result and discussion are clarified. Finally, conclusions and future works are presented in Sect. 5.

2 Preliminaries: Multi-verse Optimizer (MVO)

MVO is a recent nature-inspired algorithm proposed by Mirjalili et al. [11]. It is based on the three concepts of cosmology (white hole, black hole, and wormhole). The exploration phase is based on (white, black hole), while the wormhole is employed for improving the quality in the exploitation phase [11].

At each iteration, these universes are sorted depending on their inflation rate. The roulette wheel is employed for the selecting to have a white hole:

$$\begin{aligned} \mathbf {U} = \begin{bmatrix}y_{1}^{1}&\ldots&y_{1}^{v}\\ \ldots&\ldots&\ldots \\ y_{n}^{1}&\ldots&y_{n}^{v} \end{bmatrix} \end{aligned}$$
(1)

where the number of parameters (variables) is presented by v and the number of universes by n.

$$\begin{aligned} y_{i}^{j} = \left\{ {\begin{array}{*{20}c} {\begin{array}{*{20}c} {y_{k}^{j} } &{} {r1 < NI(Ui)} \\ \end{array} } \\ {\begin{array}{*{20}c} {y_{i}^{j} } &{} {r1 \ge NI(Ui)} \\ \end{array} } \\ \end{array} } \right. \end{aligned}$$
(2)

where \(y_{i}^{j}\) denotes jth of ith universes. Ui presents the ith universe. The normalized inflation rate is presented by NI(Ui) of the ith universe, r1 is a random value in [0, 1], and \(y{_{k}^{j}}\) presents the jth parameter of kth universe chosen by a roulette wheel selection mechanism [11].

To update the solutions, the two parameters Traveling Distance Rate (TDR) and Wormhole Existence Probability (WEP) are calculated based on Eqs. 3 and 4:

$$\begin{aligned} WEP=min +l\times \left( \frac{max-min }{L} \right) \end{aligned}$$
(3)

The minimum and maximum are presented by min (0.2) and max (1) as in Table 1, respectively, while the current iteration presented by l and L denotes the maximum number of iterations:

$$\begin{aligned} TDR=1-\left( \frac{{{l}^{1/p}}}{{{L}^{1/p}}} \right) \end{aligned}$$
(4)

The exploitation accuracy is presented by p. The large value of p indicates high perfect of local search/exploitation. The position of solutions is updated based on Eq. 5:

$$\begin{aligned} y_{i}^{j} = \left\{ {\begin{array}{*{20}l} {\left\{ {\begin{array}{*{20}l} {Y_{j} + {\text {TDR}} \times \left( {\left( {ub_{j} - lb_{j} } \right) \times r4 + lb_{j} } \right) } &{} {r3< 0.5} \\ {Y_{j} - {\text {TDR}} \times \left( {\left( {ub_{j} - lb_{j} } \right) \times r4 + lb_{j} } \right) } &{} {r3 \ge 0.5} \\ \end{array} } \right. } &{} { r2 < {\text {WEP}}} \\ {y_{i}^{j} } &{} {r2 \ge {\text {WEP}}} \\ \end{array} } \right. \end{aligned}$$
(5)

where Yj denotes the jth parameter of the best universe; \( lb_{j}\) and \(ub_{j}\) denote the lower and upper bound of jth variable, while, r2, r3, and r4 are random numbers in [0, 1]. \(y_{j}^{i}\) denotes the jth parameter of ith universe. TDR and WEP are coefficients [11].

3 The Proposed Binarization Approach

Starting with applying the MVO algorithm on the degraded manuscripts image to find the optimal cluster center based on objective function given in Eq. 6 as in the basic K-means clustering algorithm [12]. Depending on the obtained cluster centers to create BW (white, black) representing the foreground by white pixels where the darkest cluster denotes the text. In fact, at every iteration, each (universe) search agent updates its position according to (the best position). Finally, the cluster centers are updated, and the binary image is created. Figure 1 illustrates the general architecture of the proposed binarization approach and its phases.

Fig. 1
figure 1

General architecture of the proposed binarization approach

3.1 Fitness Function and MVO Parameters

Equation 6 is the squared error function that is used as an objective function of the multi-verse optimization algorithm typically as in the k-means clustering [12]:

$$\begin{aligned} J = {{\sum \limits _{j=1}^{k}{\sum \limits _{i=1}^{x}{\left\| {{x}_{i}}^{(j)}-{{c}_{j}} \right\| }^{2}}}} \end{aligned}$$
(6)

The distance measure among the cluster center \({c}_{j}\) and data points \({{{x}_{i}}^{(j)}}\) is presented by \({\left\| {{x}_{i}}^{(j)}-{{c}_{j}} \right\| }^{2}\). It denotes the distance of the n data points from their cluster centers.

The foremost target of MVO is to minimize this function. Each cluster is presented within a single centroid. Each universe presents one solution, and its position is updated according to (best solution). For any optimization algorithm, we primarily require setting some parameters value that provides better performance of the proposed approach. Table 1 refers to the MVO parameters setting.

Table 1 MVO parameter’s setting

4 Experimental Results and Discussion

H-DIBCO 2014 dataset [13] is used and employed to evaluate the proposed approach. This dataset contains ten handwritten images with different kinds of noise which are collected from tranScriptorium project [14]. This dataset is available with its ground truth. This dataset contains illustrative degradations such as bleed-through, faint characters, smudge, and low contrast.

To evaluate the proposed approach, different performance measures [15] are used and employed including F-measure [16, 17], Negative Rate Metric (NRM) [18], Peak Signal-to-Noise Ratio (PSNR) [17], Distance Reciprocal Distortion (DRD) [2], and Misclassification Penalty Metric (MPM) [18, 19]. The high values of F-measure, PSNR, and low value on DRD, NRM, and MPM indicate the best result. In addition, visual inspection is used.

Table 2 presents the results of MVO on H-DIBCO 2014; the high PSNR value appears in H08 with value 24.59, while the worst value is in H07 with value 15.32. The higher value of F-measure (98.09) is in H03, while the worst is in H07 (84.85). In addition, the better DRD value is in H03 (0.92). The best NRM and MPM appear in H03 (0.01, 0.03), respectively.

Table 2 Result of MVO on H-DIBCO 2014

Table 3 summarizes the comparison between the approaches submitted to H-DIBCO 2014 competition [13] and the proposed MVO algorithm. According to Table 3, the numbers (1 to 4) indicate the rank of submitted methods with their values of F-measure, PSNR, and DRD. The result of MVO is better than the well-known methods (Otsu and Sauvola) and the method number (4) in all performance measures.

Table 3 Results of the MVO with the state-of-the-art methods on H-DIBCO 2014
Fig. 2
figure 2

a H08 (H-DIBCO 2014) test sample, b GT, and c MVO result

For visual inspection, two images are selected named H08 and H10 as shown in Figs. 2 and 3. Figures 2 and 3 show the comparison between the ground truth images, as shown in Figs. 2b and 3b, and the MVO output images (Figs. 2c and 3c). From these figures, the output images are very close to the ground truth images with complete character structure, but we found some simple noise.

Fig. 3
figure 3

a H10 (H-DIBCO 2014) test sample, b GT, and c MVO result

The convergence rate is the last judgment measure to evaluate the proposed binarization approach. In each iteration, the solution with the best fitness is kept and it is used to create the convergence curves as in Fig. 4. This figure presents the convergence curve for two different images, and the lower fitness value with increasing the number of iterations demonstrates the convergence of the proposed approach. It is also remarkable that the fitness value decreased dramatically. The optimization problem here is a minimization problem. We can conclude from this figure that the MVO is a promising approach to address the binarization problem.

Fig. 4
figure 4

MVO convergence curve

5 Conclusions and Future Works

This paper presents a binarization approach based on the MVO algorithm, which is employed for minimizing the distance between clusters. The convergence curve rate proves the high speed of MVO algorithm. This approach can deal with various kinds of noise.

As future work, it is planned to use preprocessing phase which can improve the accuracy of binarization. Furthermore, hybridization with other optimization algorithms will be used to improve the results in [20,21,22,23,24]. A comparative analysis between the basic MVO and a chaotic version of it based on different chaos maps and different objective functions will be presented to improve the OCR recognition rate in [25, 26] by using it in the binarization phase.