1 Introduction

Character recognition (CR) is an umbrella term, used across application domains, that covers all types of machine recognition of characters [1, 2]. It involves detecting, segmenting and identifying characters in an image. More precisely, character recognition is the process of detecting and recognizing characters in an input image and converting them into ASCII or another equivalent machine-editable form. Converting handwritten characters is important for bringing historically significant documents, such as manuscripts, into machine-editable form so that they can be easily accessed and preserved.

In this paper, feature optimization is carried out by implementing two optimization techniques from the literature, differential evolution and particle swarm optimization. Cell-based directional distribution features are extracted from Telugu characters, and these features are the input to the feature selection step. The paper is organized as follows: related work is discussed in Sect. 2, the feature selection techniques are described in Sect. 3, the experimental results are discussed and compared in Sect. 4, and the conclusions are presented in Sect. 5.

2 Related Work

Pal et al. [3] developed an approach for Devanagari script containing basic, modified and compound characters. For classification, the characters are first separated using zonal information. A structural-feature-based binary tree classifier is used to recognize the basic and modified characters, and a hybrid approach combining structural features and a run-based template is employed to recognize compound characters.

Sastry et al. developed a database for palm leaf character recognition in Telugu (a south Indian language) [4, 5]. They extracted depth information from palm leaf manuscripts, and this additional feature improved recognition accuracy.

Vijaya Lakshmi et al. [6, 7, 8] worked on isolated handwritten Telugu characters using zoning techniques and hybrid classification approaches. In [6] they reported that local feature extraction approaches yield better character recognition than global feature extraction approaches. In [8] they reported that recognition accuracy can be improved by classifying the characters with two classifiers, k-NN and SVM. For various feature extraction methods, they reported improved recognition accuracy with this two-stage classification approach.

Manjunath Aradhya et al. [9, 10] worked on recognizing multilingual south Indian scripts using the Fourier transform and PCA. In [11] they worked on handwritten digit recognition using the Radon transform.

3 Proposed Methodology

In the proposed approach, 50 basic isolated handwritten characters of the Telugu script are considered for the experiments. The characters are collected from 360 scribers with different profiles and are written in an isolated manner with varying scale, translation and rotation. The documents are scanned at 300 dpi on a flatbed scanner. The spacing between samples is sufficient to segment them using the horizontal and vertical profiles. The character samples are converted into binary images using Otsu’s thresholding technique. All the images are normalized to a size of 50 \( \times \) 50. A total of 18,000 (50 \( \times \) 360) characters are considered in this work. These characters are preprocessed to remove noise and to make the images invariant to scale, translation and rotation.
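The binarization step can be illustrated with a minimal sketch of Otsu's method, which picks the gray level that maximizes the between-class variance of the histogram. The function names, the 8-bit gray range and the convention that dark pixels are ink (mapped to 1) are assumptions made for illustration; the paper does not state these details.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    pixels: iterable of integer gray values in [0, levels - 1].
    Values <= t form one class and values > t form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the gray values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no pixels in the lower class yet
        if w1 == 0:
            break     # upper class is empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(image, threshold):
    """Assumed convention: dark pixels (<= threshold) are ink and map to 1."""
    return [[1 if p <= threshold else 0 for p in row] for row in image]
```

For a scanned page, `otsu_threshold` would be called on the flattened gray values of one character image and the result passed to `binarize`.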

Features are extracted from the preprocessed images by dividing each image into cells. The directional information is extracted by superimposing 8 directional masks [8] on each cell. For a cell size of \(10 \times 10\), a \(50 \times 50\) image yields 25 cells, so the number of features extracted for a character in 8 directions is 200. Classification uses a k-NN classifier with k \(=1\): the Euclidean distance between the test image and every training image is calculated, and the test image is matched with the training image at minimum distance. All the experiments are cross-validated, as there is no separate test set. In V-fold cross validation, each fold contains characters written by N scribers (for V \(= 8\), N is 45). For a 50-class problem, each fold contains 50 \( \times \) N character samples. To test the \(t\mathrm{th}\) subset, the remaining (V-1) subsets are used for training the classifier. Hence the training and the test sets are disjoint. The average of the recognition accuracy (RA) obtained over all V folds is taken as the RA of the model.
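The evaluation protocol described above can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, and the round-robin assignment of scribers to folds is an assumption, since the paper states only that each fold holds the samples of N scribers.

```python
import math


def num_features(image_size=50, cell_size=10, directions=8):
    """Cells per image times directions, e.g. 25 cells x 8 directions."""
    cells_per_side = image_size // cell_size
    return cells_per_side * cells_per_side * directions


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_1nn(train_x, train_y, query):
    """Return the label of the training vector closest to the query."""
    nearest = min(range(len(train_x)),
                  key=lambda i: euclidean(train_x[i], query))
    return train_y[nearest]


def writer_folds(writer_ids, v=8):
    """Group sample indices into v folds so that all samples of a scriber
    land in the same fold (assumed round-robin assignment of scribers)."""
    writers = sorted(set(writer_ids))
    fold_of = {w: k % v for k, w in enumerate(writers)}
    folds = [[] for _ in range(v)]
    for idx, w in enumerate(writer_ids):
        folds[fold_of[w]].append(idx)
    return folds


def cross_validate(xs, ys, writer_ids, v=8):
    """Average 1-NN recognition accuracy over v scriber-disjoint folds."""
    accuracies = []
    for test_idx in writer_folds(writer_ids, v):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct = sum(predict_1nn(train_x, train_y, xs[i]) == ys[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)
```

With 360 scribers and `v=8`, `writer_folds` places 45 scribers in each fold, matching the values of V and N given in the text. The sketch assumes `v` does not exceed the number of scribers, so that no fold is empty.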

Dimensionality reduction, or feature selection, is carried out in this work by implementing two optimization techniques, Differential Evolution (DE) [12] and Particle Swarm Optimization (PSO) [13, 14]. These are discussed in the following subsections.

3.1 Differential Evolution

The steps involved in DE are population initialization, mutation, crossover and selection.

3.1.1 Population Initialization

The population \(P_{X,g}\) is initialized randomly with NP patterns of D dimensions, so the size of the initial population is \(NP \times D\). Let \(\overrightarrow{X}_{j,g}\) be the \(j\mathrm{th}\) pattern in the \(g\mathrm{th}\) generation. The lower limit of the search space is set to 1 and its upper limit is set to the total number of features. Fitness is evaluated for each pattern in the population. In the current work, the fitness is the classification error of the k-NN classifier (a lower error rate means better fitness). The block diagram of DE is shown in Fig. 1.
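A minimal sketch of this initialization is given below, assuming each pattern is a list of D feature indices between 1 and the total number of features. Drawing the indices without repetition and the helper names are assumptions for illustration; the paper specifies only random initialization within these limits.

```python
import random


def init_population(np_size, d, total_features, rng=random):
    """Create np_size patterns, each holding d feature indices in
    [1, total_features]. Indices are drawn without repetition here,
    which is an assumption of this sketch."""
    return [rng.sample(range(1, total_features + 1), d)
            for _ in range(np_size)]


def select_features(feature_vector, pattern):
    """Project a full feature vector onto the 1-based indices of a pattern."""
    return [feature_vector[i - 1] for i in pattern]


def evaluate(population, error_fn):
    """Fitness of every pattern; error_fn is a hypothetical callable that
    returns the k-NN classification error for one pattern."""
    return [error_fn(pattern) for pattern in population]
```

For the settings reported later in the paper, `init_population(40, 85, 200)` would give 40 patterns of 85 indices each.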

Fig. 1 Block diagram of differential evolution

3.1.2 Mutation

Let the mutant population be \(P_{Y,g}\). The mutant pattern \(\overrightarrow{Y}_{j,g}\) is generated by taking the weighted difference between two random patterns and adding it to a third random pattern from the population, as depicted in Eq. (1).

$$\begin{aligned} \overrightarrow{Y}_{j,g} = \overrightarrow{X}_{s_{0},g} + SF \cdot (\overrightarrow{X}_{s_{1},g} - \overrightarrow{X}_{s_{2},g}) \end{aligned}$$
(1)

where \(s_{0}\), \(s_{1}\) and \(s_{2}\) are three randomly generated indices into the population. \(SF \in (0,1)\) is the scale factor, which is inversely proportional to the maximum of the random numbers \(s_{1}\) and \(s_{2}\). This lets the second term in Eq. (1) oscillate within limits without overshooting the optimal solutions. In other words, it controls the rate at which the population evolves.
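Equation (1) can be sketched as below. Several details are assumptions of this sketch rather than statements from the paper: the three indices are drawn distinct from each other and from \(j\), the scale factor is passed in as a fixed value instead of being derived from the random numbers, and the result is rounded and clipped so that every entry remains a valid feature index.

```python
import random


def mutate(population, j, sf, total_features, rng=random):
    """Build the mutant pattern Y_j = X_s0 + SF * (X_s1 - X_s2)."""
    # Assumed: s0, s1, s2 are distinct and differ from j.
    candidates = [i for i in range(len(population)) if i != j]
    s0, s1, s2 = rng.sample(candidates, 3)
    x0, x1, x2 = population[s0], population[s1], population[s2]
    mutant = []
    for a, b, c in zip(x0, x1, x2):
        value = int(round(a + sf * (b - c)))
        # Assumed repair: keep the index inside [1, total_features].
        mutant.append(min(max(value, 1), total_features))
    return mutant
```

The population must contain at least four patterns for three distinct indices other than \(j\) to exist.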

3.1.3 Crossover

DE employs uniform crossover in order to build the trial pattern/vector \(\overrightarrow{U}_{j,g}\). It crosses each original pattern with a mutant pattern as given in Eq. (2).

$$\begin{aligned} \overrightarrow{U}_{j,g}= {\left\{ \begin{array}{ll} \overrightarrow{Y}_{j,g}, &{} \text {if}\ s(0,1) \le C_{r} \\ \overrightarrow{X}_{j,g}, &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(2)

where \(C_{r} \in (0,1)\) is the crossover rate and \(s(0,1)\) is a random number in (0,1). The crossover rate controls the fraction of values copied from each of the two patterns/vectors. In the current work, \(C_{r}\) is set to 0.5.
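A minimal sketch of Eq. (2) follows, assuming the random draw is made independently for every dimension, which is how uniform crossover is usually realized. The function name is illustrative.

```python
import random


def crossover(original, mutant, cr=0.5, rng=random):
    """Build the trial pattern U_j: each position takes the mutant value
    when a fresh random draw is <= cr, otherwise the original value."""
    return [y if rng.random() <= cr else x
            for x, y in zip(original, mutant)]
```

With `cr=0.5`, roughly half of the entries of the trial pattern come from the mutant and half from the original pattern.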

3.1.4 Selection

A roulette wheel is employed to handle redundancy among the selected features. The feature distribution factor \(FD_{i,g}\), computed using Eq. (3), is fed to the roulette wheel to decide which feature replaces a duplicated feature.

$$\begin{aligned} FD_{i,g} =\frac{PD_{i}}{PD_{i}+ND_{i}} \end{aligned}$$
(3)

where \(PD_{i}\) and \(ND_{i}\) denote the positive and negative distributions of the feature \(f_{i}\), respectively. The positive distribution is the number of times \(f_{i}\) contributed to forming good subsets (low fitness error). The negative distribution is the number of times \(f_{i}\) was used in less competitive subsets (high fitness error).
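The duplicate-repair step can be sketched as follows. The helper names, the neutral weight of 0.5 for features that have not been used yet, and the rule that a replacement must not already appear in the pattern are assumptions of this sketch; the paper gives only Eq. (3) and the use of a roulette wheel.

```python
import random


def feature_distribution(pd, nd):
    """Eq. (3): FD_i = PD_i / (PD_i + ND_i) for every feature i.
    Features never used so far get a neutral 0.5 (assumed)."""
    return [p / (p + n) if (p + n) > 0 else 0.5 for p, n in zip(pd, nd)]


def roulette_pick(weights, excluded, rng=random):
    """Pick a 0-based index with probability proportional to its weight,
    skipping the indices in `excluded`."""
    candidates = [i for i in range(len(weights)) if i not in excluded]
    total = sum(weights[i] for i in candidates)
    if total <= 0:
        return rng.choice(candidates)
    r = rng.random() * total
    acc = 0.0
    for i in candidates:
        acc += weights[i]
        if r < acc:
            return i
    return candidates[-1]


def repair_duplicates(pattern, fd, rng=random):
    """Replace repeated 1-based feature indices with features drawn by
    the roulette wheel from those not yet present in the pattern."""
    used = set(pattern)
    seen = set()
    repaired = []
    for f in pattern:
        if f in seen:
            idx = roulette_pick(fd, {u - 1 for u in used}, rng)
            f = idx + 1
            used.add(f)
        seen.add(f)
        repaired.append(f)
    return repaired
```

Features with a high positive distribution relative to their negative distribution are more likely to be chosen as replacements.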

The lower the fitness error, the better the pattern/individual. Based on the fitness evaluation, DE replaces the original pattern with the trial pattern when the trial pattern is better. In other words, it selects the better of the two patterns for the next generation.
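The selection rule can be sketched as below, where `error_fn` is a hypothetical callable returning the k-NN classification error of a pattern. Keeping the trial pattern on ties is an assumption of this sketch.

```python
def select_next_generation(population, trials, error_fn):
    """For each pair (original, trial), keep the pattern with the lower
    error. Returns the surviving patterns and their errors."""
    survivors, errors = [], []
    for original, trial in zip(population, trials):
        err_original = error_fn(original)
        err_trial = error_fn(trial)
        if err_trial <= err_original:  # ties go to the trial (assumed)
            survivors.append(trial)
            errors.append(err_trial)
        else:
            survivors.append(original)
            errors.append(err_original)
    return survivors, errors
```

Because a pattern is replaced only by one that is at least as good, the best error in the population never increases from one generation to the next.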

All the above steps are repeated until the maximum number of generations is reached, and the best pattern found is taken as the optimum solution.

3.2 Particle Swarm Optimization

To provide a comparison for differential evolution, PSO, developed by Eberhart and Kennedy in 1995, is also implemented. Let the initial population \(P_{X,g}\) be generated randomly with NP particles of D dimensions. Let the position and velocity vectors of the \(j\mathrm{th}\) particle in the swarm at the \(g\mathrm{th}\) generation be \(\overrightarrow{X}_{j,g}\) and \(\overrightarrow{V}_{j,g}\), respectively. The flowchart of PSO is shown in Fig. 2. As with DE, the fitness computed for each particle is the classification error of the k-NN classifier.

Fig. 2 PSO flowchart

In every generation, the velocities of the particles are updated using Eq. (4) and their positions are updated using Eq. (5). The best position found so far by the particle is denoted by \(\overrightarrow{X}_{best}\) and the best position found so far by the swarm is denoted by \(\overrightarrow{G}_{best}\).

$$\begin{aligned} \overrightarrow{V}_{j,g+1} = \omega \times \overrightarrow{V}_{j,g} + c_{1} \times rand1() \times (\overrightarrow{X}_{best}-\overrightarrow{X}_{j,g}) + c_{2} \times rand2() \times (\overrightarrow{G}_{best}-\overrightarrow{X}_{j,g}) \end{aligned}$$
(4)
$$\begin{aligned} \overrightarrow{X}_{j,g+1} = \overrightarrow{X}_{j,g} + \overrightarrow{V}_{j,g+1} \end{aligned}$$
(5)

where rand1() and rand2() are random numbers generated over the range [0, 1], \(c_{1}\) and \(c_{2}\) are the cognitive and social acceleration constants, respectively, and \(\omega \) is a linearly time-varying weight given by

$$\begin{aligned} \omega = (\omega _{1}-\omega _{2}) \times \frac{MAXGEN-g}{MAXGEN} \end{aligned}$$
(6)

where \(\omega _{1}\) and \(\omega _{2}\) are the inertia weights, g is the current generation and MAXGEN is the maximum number of generations. In the current work \(c_{1}\) and \(c_{2}\) are set to 2.
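Equations (4) to (6) can be sketched as below. The inertia weight follows Eq. (6) exactly as written; a widely used variant adds \(\omega _{2}\) to the right-hand side so that the weight decays from \(\omega _{1}\) to \(\omega _{2}\). Drawing fresh random numbers for each dimension and the function names are assumptions of this sketch, and the paper does not report the values of \(\omega _{1}\) and \(\omega _{2}\).

```python
import random


def inertia_weight(g, max_gen, w1, w2):
    """Eq. (6) as written: decreases linearly with the generation g."""
    return (w1 - w2) * (max_gen - g) / max_gen


def pso_step(x, v, x_best, g_best, omega, c1=2.0, c2=2.0, rng=random):
    """One update of a single particle.

    x, v    : current position and velocity (lists of floats)
    x_best  : best position found so far by this particle
    g_best  : best position found so far by the swarm
    Returns the new position and the new velocity.
    """
    new_x, new_v = [], []
    for xi, vi, pb, gb in zip(x, v, x_best, g_best):
        r1, r2 = rng.random(), rng.random()  # per-dimension draws (assumed)
        vel = omega * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)  # Eq. (4)
        new_v.append(vel)
        new_x.append(xi + vel)                                        # Eq. (5)
    return new_x, new_v
```

For feature selection, the continuous positions would still have to be mapped to feature indices, for example by rounding and clipping; the paper does not describe this mapping.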

The positions and velocities of the particles are updated until the maximum number of generations is reached, and the best position found gives the optimum subset of features.

4 Experimental Results

The Telugu characters collected from 360 different scribers are used for training and testing in this work. The number of classes considered is 50. In total, 18,000 characters are used for training and testing. The directional information in each cell of a character image is extracted as features from this dataset. A k-NN classifier is employed for classification. The recognition accuracy obtained with the full set of extracted features is 85.6 %.

The optimization techniques used in the current work, Differential Evolution and Particle Swarm Optimization, are started from the same initial population. The population size is set to 40 and the number of generations is set to 50. The feature subset size is varied from 5 to 100 in steps of 5. For each subset size, the experiment is run 20 times, and the average recognition accuracies are shown in Fig. 3.

Fig. 3 Comparison of recognition accuracies using feature optimization techniques

Table 1 Comparison with the existing systems on the same dataset

The best recognition accuracy achieved using PSO is 86.7 % for an optimum feature subset size of 90, as shown in Fig. 3, which is in the same band as the 85.6 % obtained with the original feature set. It is observed from Fig. 3 that DE performed well compared to PSO even for smaller subset sizes. The highest recognition accuracy obtained is 89.1 %, using the DE algorithm with an optimum subset size of 85 features. This is an improvement of 3.5 percentage points in recognition accuracy with 85 optimum features over the 200 features considered earlier. Hence, even with the feature set reduced by more than 50 % (from 200 to 85 features, a reduction of 57.5 %), the recognition accuracy increased by 3.5 percentage points. In other words, more than 50 % of the features are redundant and are removed by these optimization techniques.
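The arithmetic behind these figures can be checked with a short sketch; the function names are illustrative and the numbers are those reported above.

```python
def feature_reduction_percent(total_features, kept_features):
    """Share of the original features that is discarded, in percent."""
    return 100.0 * (total_features - kept_features) / total_features


def accuracy_gain_points(new_accuracy, baseline_accuracy):
    """Difference between two accuracies, in percentage points."""
    return round(new_accuracy - baseline_accuracy, 1)
```

Here `feature_reduction_percent(200, 85)` gives 57.5 and `accuracy_gain_points(89.1, 85.6)` gives 3.5, while the same comparison for PSO, `accuracy_gain_points(86.7, 85.6)`, gives 1.1.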

Vijaya Lakshmi et al. [8] worked on the same dataset and employed the same feature set (cell-based directional information) to recognize Telugu handwritten characters. They presented a hybrid classification approach to improve the recognition rate, in which the characters misclassified by the k-NN classifier in the first stage are classified again by an SVM classifier in the second stage. They reported a recognition rate of 89.3 % for cell-based directional features using this hybrid approach. The results obtained with this hybrid method are compared with those of the optimization techniques presented in the current study in Table 1. Compared to the hybrid approach, the proposed optimization techniques reduce the computational complexity.

5 Conclusion

In this work, experiments are conducted on Telugu handwritten characters by extracting cell-based directional distribution features. Differential evolution and particle swarm optimization are implemented on these features. By varying the desired number of features, classification accuracies are obtained and compared for the two techniques. Differential evolution improves the recognition accuracy by 3.5 percentage points over the single-stage classification system. This improvement is obtained even though the feature set is reduced by more than 50 %, which indicates that more than 50 % of the features are redundant and can be removed by these optimization techniques.