1 Introduction

Character recognition is an active research area in the field of pattern recognition. It automatically converts physical text information (numerals, letters, and symbols) into a corresponding machine-readable digital format [59]. There are two categories of character recognition, namely online and offline. In online character recognition, one writes on an electronic surface such as an electronic tablet with a special pen or digitizer. Characters are captured as a sequence of strokes together with speed and pen up/down data, and they are recognized in real time as they are written [81]. Offline character recognition is the process of translating handwritten characters on paper into a format that machines can understand. It captures the information from a paper document by optical or magnetic scanning and can therefore be further classified into optical character recognition and magnetic character recognition [57]. Offline character recognition is more challenging due to the shape of characters, the great variety of character symbols, document quality, and the non-availability of stroke information [81]; it is therefore a more difficult task than its online counterpart. The classification of character recognition is depicted in Fig. 1.

Fig. 1
figure 1

Classifications of character recognition

Offline character recognition differs significantly from online character recognition, as summarized in Table 1.

Table 1 Online versus offline character recognition

Recognition may be carried out for both printed and handwritten characters. The major issues in handwritten character recognition arise from the huge variety in writing styles, such as character shape, writing speed, and stroke thickness. At present, printed character recognition frameworks yield higher recognition accuracy than handwritten character recognition frameworks; handwritten character recognition therefore still has constrained capabilities. Further, handwritten character recognition may be carried out with or without a segmentation approach. Accurate, robust, and reliable handwritten character recognition by a computer system would be greatly useful for automatic number plate recognition, cheque reading, postcode recognition, signature verification, reading aids for the blind, and similar applications.

The remainder of the paper is organized as follows: Section 2 outlines the history, applications, and challenges of Devanagari Handwritten Character Recognition (HCR). Section 3 gives an overview of the Devanagari script. Section 4 presents the motivation for readers and scholars working in the relevant domain. The HCR methodology is presented in Section 5. Section 6 surveys the feature extraction and classification methods considered for Devanagari HCR, along with a comparative study. Research gaps are given in Section 7. Challenges for the present work are presented in Section 8. Section 9 discusses a few recommendations on future directions of Devanagari character recognition. Finally, conclusions are presented in Section 10.

2 Background

To appreciate the significance of Optical Character Recognition (OCR) methods in general, it is essential to present background information about the underlying problems, applications, and technical challenges. Methodologically, OCR is a sub-component of pattern recognition, and it provided much of the impetus for developing the pattern recognition and image analysis fields. A brief history of machine recognition of scripts is presented in Table 2.

Table 2 History of machine recognition of scripts

2.1 Applications

Nowadays, there is a vast demand for techniques that attempt automatic optical character or script identification or recognition. Techniques that cover the needs of different application areas are listed in this subsection [22, 104, 118].

  • Airline ticket readers

  • Automatic license plate recognition

  • Bill processing systems

  • Cheque reading

  • Data classification through learning process

  • Editing old documents

  • Employee code reading/verification

  • Forensic document analysis

  • Form processing

  • Handwritten notes reading

  • Human-robot interaction

  • Library archival

  • Meaning translation

  • Passport number reading/verification

  • Postcode/pin-code recognition for postal automation

  • Reading aid for blind and visually impaired users

  • Recognition of ancient documents

  • Sign board translation

  • Signature verification

  • Writer verification

Several major challenges have to be identified and handled in order to accomplish effective automation. These challenges are discussed in the following sub-sections; they are presented here to draw the interest of readers to character recognition.

2.2 Challenges for Devanagari HCR

High-quality or high-resolution images (with basic structural properties such as strong contrast between text and background) are desired for achieving higher OCR accuracy, since accuracy depends directly on the quality of the input image. In order to achieve successful automation with OCR techniques, numerous sources of error that dramatically affect image quality have to be overcome [15, 52, 81, 118]; they are described below:

Aspect ratio

Text may be short (e.g., traffic signs) or much longer (e.g., video captions). To detect text, a search over the location, scale, and length of the text has to be carried out, which introduces high computational complexity.

Blurring and degradation

Character sharpness is required for accurate character recognition and character segmentation. Uneven focus results from small changes in the point of view or from capturing a moving object. It causes blurring and degradation in input images, which further reduces the accuracy of an OCR system [71].

Character complexity

Handwritten Devanagari characters are complex in structure and shape. The script has a large character set with many curves, loops, and other fine details in the characters.

Complex background

Working with a complex background is a much greater challenge for an OCR system than working with a plain background.

Different shapes and size of characters

Segmentation and classification become challenging tasks in handwritten character recognition due to the different shapes and sizes of handwritten characters.

Existence of uneven illumination

Capturing images in natural environments may result in uneven lighting and shadows. This degrades the desired characteristics of the image and may lead to less accurate detection, segmentation, and recognition.

Lack of standard test database

Unfortunately, few standard handwritten character databases for the Devanagari script are publicly available as benchmarks, so the recognition accuracies of various techniques cannot easily be compared on a common platform.

Larger character set due to modifiers

In the Devanagari script, there are upper and lower modifiers due to which two successive lines may overlap with each other. This may result in poor segmentation and hence a lower recognition rate.

Low resolution

Recognizing text captured in a photograph or scene text with low resolution remains an unsolved problem for OCR systems unless the captured image is first treated with suitable preprocessing methods.

Noisy background

Generally, it can be seen that noise gets added to the document/image during the scanning phase. Later, it becomes challenging to remove such background noise while performing digitization or binarization.

Physical and mental state of the writer

Developing a framework for character recognition also poses a challenge to researchers due to the physical and mental state of the writer, writing instrument, pen width, ink color, and many other such factors.

Poor quality of documents

Such documents usually contain holes, spots, noise, broken strokes, etc., which may make the process of line segmentation very challenging.

Scene complexity

Numerous man-made objects, such as buildings and paintings, appear in natural environments with structural properties similar to those of text. This makes it difficult for OCR systems to distinguish text from non-text in the processed image.

Similar-shaped characters

Another challenge for character recognition is to recognize similarly shaped characters or symbols. In the Devanagari script, there exist many character pairs, such as क-फ and घ-ध, that are quite similar in shape.

Skewness

Skew correction has remained a challenge for optical character recognition systems [19, 65], and various researchers have proposed simple and effective processes to correct the skewness of images, such as the OJ method [80], which is suitable for any degree of rotation. Poor results may be observed if a skewed image is fed directly into the OCR system without applying a suitable preprocessing method.

Speed of writing

Characters can be represented as the trajectory drawn by the pen (up/down) on a writing medium. Properties of the characters such as overlapping and touching also depend on the speed of writing, which sometimes becomes a challenging problem during character recognition.

Variations of text layout or fonts

Characters in cursive or italic style and in script fonts may cause difficulty in segmentation due to their overlap with each other [111]. Recognition also becomes difficult when the number of classes is large, i.e., when there are large within-class variations drawn from many pattern sub-spaces.

Various styles of human writing

Every person has his/her own distinct style of writing, which may make it difficult to recognize the characters. Character size, shape, orientation, etc. vary from person to person.

Warping

For OCR systems, warping or elastic deformation of images is another challenge, since content or characters with varying geometry have to be recognized. Such a situation may arise when an image is captured with a handheld camera. Ulges et al. [112] and Meshesha and Jawahar [71] have explored the rectification of warped document images, called de-warping.

These factors may lead to incorrect character recognition by a computer system. Thus, there is a need for methods that overcome these challenges so that the character recognition framework automatically produces correct, machine-readable digital output. These challenges must be considered during the design and implementation of a character recognition system to make it more effective.

3 Overview of the Devanagari script

Devanagari belongs to the Brahmic family of scripts of India, Nepal, Tibet, and the South Asian subcontinent [2]. It is used by more than 500 million people for writing numerous languages, viz. Hindi, Sanskrit, Marathi, and Nepali, along with similar languages of the South Asian subcontinent [23, 55]. The Devanagari script consists of 13 vowels, 34 consonants, and 14 vowel modifiers, as depicted in Fig. 2.

Fig. 2
figure 2

Devanagari script (a) Consonants and their corresponding half forms (b) Vowels (c) Modifiers

Moreover, apart from the above, it has compound or composite characters, which may be formed by combining two or more basic characters. Compound characters and modifiers can be attached adjacent to, on top of, or at the bottom of the basic character [21]. A vowel following a consonant may take a modified shape, depending on whether the vowel is placed to the left, right, top, or bottom of the consonant; such forms are known as modifiers or matras. There is no concept of lower-case and upper-case characters, and characters, including text and digits, are written from left to right. The Devanagari script has its own composition rules for combining vowels, consonants, and modifiers [13, 46]. An additional feature of Devanagari is a horizontal line on top of characters called the header line or shirorekha [30, 114]. Two or more characters are joined to form a word by joining the header lines of individual characters. A word written in the Devanagari script may be divided into three strips, viz. top, core, and bottom: the header line separates the top and core strips, whereas a virtual baseline separates the core and bottom strips. Knowledge of the script is important in the sense that if a person knows the script of a language, he/she can easily read words written in that script on the basis of his/her mental dictionary. Figure 2 shows the consonants and their corresponding half forms, vowels, and modifiers of the Devanagari script. The three strips of a word in the Devanagari script are depicted in Fig. 3.

Fig. 3
figure 3

Three strips of a word in Devanagari script

4 Motivations for the readers

Research and development of HCR is growing throughout the world for various languages. Many researchers are currently working on the challenges of such systems to make HCR more accurate, robust, and reliable. In the present scenario, sufficient research work is available for the recognition of printed text written in non-Indic scripts such as Roman, Chinese, and Japanese. Some research work can also be traced for printed text written in Indic scripts such as Devanagari, Bangla, and Gurumukhi; for example, a lot of work has recently been presented for the Gurmukhi script [24, 25, 50, 62, 63]. However, work on recognizing documents written in the Devanagari script is still ongoing and has not yet matured enough to achieve high recognition accuracy within an optimal time. Identifying the potential need, the large application areas, and the exciting challenges involved in this field is therefore the key motivation for future researchers working in this area.

In India, many people use the Devanagari script for documentation, and there has been significant improvement in research related to Devanagari HCR systems. Although researchers have suggested different methods for online and offline HCR systems, these share many common problems and solutions. Offline HCR is more complex and hence requires more research compared to online and machine-printed recognition. Various library functions have been developed in MATLAB and OpenCV for the preprocessing phase of character recognition. This paper attempts to address advancements in Devanagari HCR systems, especially their feature extraction and classification methods, up to 2022. Devanagari HCR has been the subject of intensive study in the last few decades, yet it is still an open problem far from its final frontier. Moreover, the study also reveals a great need for efforts towards the progress of multilingual resources.

5 Handwritten character recognition approaches

In general, handwritten character recognition approaches can be broadly divided into two categories: the traditional approach, which uses conventional feature extraction and classification methods, and the deep learning approach, as depicted in Fig. 4.

Fig. 4
figure 4

Steps for handwritten character recognition

5.1 Image acquisitions or digitization

In digitization, a handwritten paper-based document is scanned to produce a bitmap image in electronic form. The resulting digital image is fed to the pre-processing phase.

5.2 Pre-processing

It is a preliminary phase that aims to minimize the degradation of the acquired image and produce a normalized bitmap image. Pre-processing involves a number of steps such as binarization, skeletonization, dilation, edge detection, noise removal, image enhancement for contrast stretching, thinning and filling, normalization, and skew detection and correction [6, 34, 47].

Binarization

It is a process that converts a grayscale image into a binary image (containing only two levels, i.e., 0 and 1). Basically, binarization is used for separating foreground pixels from background pixels in an image using a suitable threshold.
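As a minimal illustration (not part of the original survey), the following Python sketch applies Otsu's global thresholding with OpenCV to separate ink from background; the input file name and the assumption that ink is darker than the paper are illustrative.

```python
import cv2

# Load the scanned page as a grayscale image (path is illustrative).
gray = cv2.imread("scanned_page.png", cv2.IMREAD_GRAYSCALE)

# Otsu's method picks the threshold automatically from the histogram.
# THRESH_BINARY_INV makes dark ink become foreground (value 255).
_, binary = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

cv2.imwrite("binary_page.png", binary)
```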

Skeletonization

Foreground regions in a binary image are reduced to a skeletal remnant; this is called skeletonization. It preserves the extent as well as the connectivity of the original region while removing most of the original foreground pixels. Generally, it is applied to reduce the line width of the text from several pixels to a single pixel.
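A hedged sketch of this step using scikit-image's `skeletonize` is shown below; the input file name is a placeholder, and the image is assumed to be a binarized character with foreground pixels set to 255.

```python
import cv2
import numpy as np
from skimage.morphology import skeletonize

# Binarized character image (illustrative path), foreground = 255.
binary = cv2.imread("binary_char.png", cv2.IMREAD_GRAYSCALE)

# skeletonize expects a boolean image; it erodes strokes down to a
# one-pixel-wide skeleton while preserving connectivity.
skeleton = skeletonize(binary > 0)
cv2.imwrite("skeleton_char.png", skeleton.astype(np.uint8) * 255)
```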

Detection of edges

It involves detecting the edges, i.e., selecting the outline of an object in the digitized image. Various edge detection operators such as Sobel, Canny, and first- and second-derivative methods can be applied for this purpose.
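The sketch below illustrates first-derivative (Sobel) gradients and Canny edge detection with OpenCV; the threshold values and file name are illustrative and would need tuning for real documents.

```python
import cv2

gray = cv2.imread("scanned_page.png", cv2.IMREAD_GRAYSCALE)

# First-derivative (Sobel) gradients in the x and y directions.
grad_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
grad_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)

# Canny combines gradient magnitude with hysteresis thresholding;
# the two thresholds (100, 200) are illustrative and data dependent.
edges = cv2.Canny(gray, 100, 200)
```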

Erosion and dilation of images

After locating the edges, dilation and erosion operations are used to increase or decrease objects in size to produce the pre-processed image suitable for segmentation. Erosion removes or erodes away the pixels on the image edges and results in a smaller object, whereas dilation produces a larger object by adding pixels around the image edges.
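A brief OpenCV sketch of these two morphological operations is given below; the 3 × 3 structuring element and the single iteration are assumptions, not recommendations.

```python
import cv2

binary = cv2.imread("binary_page.png", cv2.IMREAD_GRAYSCALE)

# A 3x3 rectangular structuring element; its size is illustrative.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

eroded = cv2.erode(binary, kernel, iterations=1)    # shrinks foreground strokes
dilated = cv2.dilate(binary, kernel, iterations=1)  # thickens foreground strokes
```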

Noise removal

It is carried out to remove unwanted bits, called noise, that do not play a substantial role in the document, in order to allow better processing. Morphological operations, filtering (such as median, Gaussian, mean, min-max, and Wiener filters), and noise modeling may be applied to remove noise from images.
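As an illustration, the sketch below applies median and Gaussian filtering with OpenCV; the kernel sizes and input path are illustrative choices.

```python
import cv2

gray = cv2.imread("scanned_page.png", cv2.IMREAD_GRAYSCALE)

# Median filtering suppresses salt-and-pepper scanning noise while
# preserving stroke edges better than plain averaging.
denoised_median = cv2.medianBlur(gray, 3)

# Gaussian filtering smooths high-frequency sensor noise.
denoised_gauss = cv2.GaussianBlur(gray, (3, 3), 0)
```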

Thinning and filling

Thinning improves the visibility and structural information of characters in a scanned image by reducing the stroke width, while filling eliminates gaps, small breaks, and holes in digitized characters. Thinning removes selected foreground pixels in digitized characters and extracts features related to the shape information of the characters [41].

Normalization

Normalization is usually applied to improve the accuracy of OCR systems; it produces uniform character size, rotation, and slant by reducing the shape variation in a scanned document. It gives a remarkable reduction in data size without requiring any structural information about the image [66].

Skew detection and correction

Skewness means the tilt or misalignment of the bit-mapped image of the scanned document; it may also be introduced by the writer while writing a document. Skew detection and correction techniques are used to bring such documents or images into correct alignment. These techniques include projection profile analysis, Hough transforms, clustering, connected components, and correlation between lines.
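One simple projection-profile approach is sketched below under the assumption of a binarized page with roughly horizontal text lines: the image is rotated over a range of candidate angles and the angle that maximizes the variance of the row-wise ink profile is kept. The angle range, step, and file name are illustrative.

```python
import cv2
import numpy as np

def estimate_skew(binary, angle_range=5.0, step=0.5):
    """Brute-force projection-profile skew estimate (degrees).

    The rotation that maximizes the variance of the horizontal
    projection aligns text lines with the image rows.
    """
    h, w = binary.shape
    center = (w // 2, h // 2)
    best_angle, best_score = 0.0, -1.0
    for angle in np.arange(-angle_range, angle_range + step, step):
        M = cv2.getRotationMatrix2D(center, float(angle), 1.0)
        rotated = cv2.warpAffine(binary, M, (w, h), flags=cv2.INTER_NEAREST)
        profile = rotated.sum(axis=1)          # row-wise ink counts
        score = np.var(profile)
        if score > best_score:
            best_angle, best_score = float(angle), score
    return best_angle

binary = cv2.imread("binary_page.png", cv2.IMREAD_GRAYSCALE)
angle = estimate_skew(binary)
h, w = binary.shape
M = cv2.getRotationMatrix2D((w // 2, h // 2), angle, 1.0)
deskewed = cv2.warpAffine(binary, M, (w, h), flags=cv2.INTER_NEAREST)
```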

5.3 Segmentation

In HCR, segmentation plays a significant role and it is used to break the scanned document into paragraphs, lines, words, and characters [93, 109]. Segmentation of handwritten characters is a challenging task due to a variety of writing styles [77]. The accuracy of HCR systems highly depends upon the detection of the best segmentation points for paragraphs, lines, words, and characters [34]. Segmentation is divided into the following parts:

Line segmentation

It is a complex task and the initial stage of the segmentation phase. Researchers have developed various techniques for line segmentation, broadly divided into four groups based on projection profiles, the Hough transform, smearing techniques, and thinning operations.
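A minimal projection-profile sketch is given below; it assumes a deskewed, binarized page (foreground = 255) and treats every maximal run of non-empty rows as one text line. Real pages would need a noise threshold instead of a strict zero test.

```python
import cv2

binary = cv2.imread("binary_page.png", cv2.IMREAD_GRAYSCALE)

# Row-wise ink density: text lines show up as runs of non-zero rows.
profile = (binary > 0).sum(axis=1)
lines, in_line, start = [], False, 0
for row, count in enumerate(profile):
    if count > 0 and not in_line:
        in_line, start = True, row          # a text line begins
    elif count == 0 and in_line:
        in_line = False
        lines.append(binary[start:row, :])  # crop one text line

if in_line:                                  # line running to the last row
    lines.append(binary[start:, :])
```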

Word segmentation

Word segmentation divides handwritten text lines into words. Most of the existing techniques use a vertical projection profile for this purpose. White space and pitch methods are also used by various researchers to divide a handwritten text line into words.

Zone segmentation

A line of Devanagari text can be divided into three horizontal zones, namely upper (top), middle, and lower (bottom). The upper zone and the middle zone are always separated by the header line, known as the shirorekha. The upper zone is the region above the headline, while the middle zone is the region just below the headline and above the lower zone. The lower zone is the lowest part and contains some vowel components that belong to vowel modifiers. In Hindi words, upper and lower modifiers are not always present.

Character segmentation

Character segmentation splits a text region into multiple regions of single characters. Vertical projection profile analysis was an early method for character segmentation. Character segmentation involves extracting individual characters without including components of adjoining characters, even when these characters are not touching. Recognition-free and recognition-based segmentation are the methods used for character segmentation; the task becomes more complex when characters are touching.
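A simplified recognition-free recipe for Devanagari is sketched below: the shirorekha (the row band with the maximum horizontal projection) is blanked out so that the characters of a word separate, and cuts are then made at empty columns of the vertical projection. The input word image and the header-line band width are assumptions; touching characters would need a more elaborate method.

```python
import cv2
import numpy as np

word = cv2.imread("binary_word.png", cv2.IMREAD_GRAYSCALE)  # one segmented word
work = (word > 0).astype(np.uint8)

# Remove the shirorekha: blank out a small band around the row with
# maximum horizontal projection (the band height is illustrative).
header_row = int(np.argmax(work.sum(axis=1)))
work[max(0, header_row - 2):header_row + 3, :] = 0

# Cut characters at empty columns of the vertical projection.
profile = work.sum(axis=0)
chars, in_char, start = [], False, 0
for col, count in enumerate(profile):
    if count > 0 and not in_char:
        in_char, start = True, col
    elif count == 0 and in_char:
        in_char = False
        chars.append(word[:, start:col])    # crop from the original word
if in_char:
    chars.append(word[:, start:])
```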

5.4 Feature extraction

Feature extraction plays a major role and is an important phase in pattern recognition. The features represent precise information extracted from segmented characters (symbols or words) that distinguishes a particular character from the others. The recognition accuracy of an HCR system also depends upon the selection of feature extraction techniques. Feature extraction can be carried out in a number of ways; the essential requirement is to extract those features that can distinguish the dissimilar patterns or character classes that exist [113]. Features can be classified into the following major categories:

Statistical features

Statistical features capture characteristics of the distribution of pixel values in the bitmap image. These features can be calculated from the statistical distribution of points, e.g., moments, zoning, histograms, or projections.
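As an example of a zoning-based statistical feature, the sketch below divides a size-normalized character into a 4 × 4 grid and uses the ink density of each zone as one feature; the grid and normalization size are illustrative choices, not values prescribed by the surveyed works.

```python
import cv2
import numpy as np

def zoning_features(char_img, grid=(4, 4), size=(32, 32)):
    """Normalize a character to a fixed size and return the foreground
    pixel density of each zone as a flat feature vector."""
    img = cv2.resize(char_img, size, interpolation=cv2.INTER_AREA)
    binary = (img > 0).astype(np.float32)
    zh, zw = size[1] // grid[0], size[0] // grid[1]
    feats = []
    for r in range(grid[0]):
        for c in range(grid[1]):
            zone = binary[r * zh:(r + 1) * zh, c * zw:(c + 1) * zw]
            feats.append(zone.mean())       # ink density of this zone
    return np.array(feats)                  # 16 values for a 4x4 grid
```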

Structural features

Structural features depict a pattern in terms of its topology and geometry by giving it local and global properties. These features are mainly based on geometrical properties of a symbol or character viz. loops, directions of strokes, intersections of strokes, and endpoints.

Global transformation-based features

Global transformation techniques such as the Fourier transform, discrete cosine transform, wavelet transform, Hough transform, and moments have the ability to transform the pixel representation into a corresponding denser form. These techniques generally represent the signal as a linear combination of a sequence of simpler, well-defined functions; the expansion coefficients then provide a compact encoding.
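The sketch below illustrates one such transform-based feature: the low-frequency block of the 2-D discrete cosine transform of a size-normalized character, computed with OpenCV. The normalization size and the number of retained coefficients are assumptions.

```python
import cv2
import numpy as np

def dct_features(char_img, size=(32, 32), keep=8):
    """Return the top-left keep x keep block of 2-D DCT coefficients,
    a compact low-frequency description of the character shape."""
    img = cv2.resize(char_img, size).astype(np.float32) / 255.0
    coeffs = cv2.dct(img)                   # 2-D discrete cosine transform
    return coeffs[:keep, :keep].flatten()   # 64 coefficients for keep = 8
```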

Template matching based features

Template matching compares patterns pixel by pixel to identify them. Generally, character recognition in this approach does not need preprocessing such as thinning and pruning [16], but these approaches are more sensitive to font and size variations of characters. Such features can be used to recognize compound characters but are not suitable for documents with noisy backgrounds.
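A hedged sketch of template matching with OpenCV's normalized cross-correlation is shown below; the template images, labels, and file names are placeholders for a real prototype set.

```python
import cv2

char = cv2.imread("unknown_char.png", cv2.IMREAD_GRAYSCALE)

# templates: mapping from class label to a prototype image (placeholders).
templates = {"ka": cv2.imread("template_ka.png", cv2.IMREAD_GRAYSCALE)}

best_label, best_score = None, -1.0
for label, tmpl in templates.items():
    resized = cv2.resize(char, (tmpl.shape[1], tmpl.shape[0]))
    # Normalized cross-correlation between the input and the template;
    # with equal sizes the result is a single score.
    score = cv2.matchTemplate(resized, tmpl, cv2.TM_CCOEFF_NORMED)[0, 0]
    if score > best_score:
        best_label, best_score = label, score
```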

Many researchers have tried to combine the above features in order to achieve better feature extraction.

5.5 Classification

Classification or recognition is a decision-making phase that uses the features extracted in the earlier phase to decide class membership in the pattern recognition system [10]. It compares the input features with the stored patterns and returns the best matching class for the input. It can be performed using either template-based or feature-based methods.

Template-based method

It involves the direct comparison between an unknown input pattern and an ideal pattern [11, 101]. The amount of correlation between these two patterns is considered for classification or recognition.

Feature-based method

It extracts features from the input pattern and uses these features in classification or recognition models such as Artificial Neural Networks [14, 45, 52, 82], Hidden Markov Models [100], Support Vector Machines [14, 35, 86], and the Modified Quadratic Discriminant Function [82, 97].
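As an illustration of feature-based classification, the sketch below trains an RBF-SVM with scikit-learn on precomputed feature vectors; the file names, kernel, and hyper-parameters are assumptions rather than settings used by any of the cited works.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# X: one feature vector per character (e.g. zoning or DCT features),
# y: the corresponding class labels; both file names are placeholders.
X = np.load("features.npy")
y = np.load("labels.npy")

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = SVC(kernel="rbf", C=10, gamma="scale")   # RBF-SVM classifier
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```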

5.6 Deep learning approach

These approaches have produced prominent results in recent years and, in some cases, outperform human experts. To improve existing results, researchers are re-examining existing problems using deep learning approaches and have introduced different deep learning architectures in recent years, viz. deep convolutional neural networks, deep belief networks, and recurrent neural networks. Nowadays, researchers are extensively using such machine learning approaches for character recognition. Deep learning approaches are basically composed of multiple hidden layers, each consisting of multiple neurons, which learn suitable weights for the deep network.
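The sketch below shows a small convolutional network in Keras for 32 × 32 grayscale character images; the layer sizes, the number of classes, and the optimizer are illustrative and not taken from any specific paper surveyed here.

```python
from tensorflow.keras import layers, models

# A small CNN for 32x32 grayscale character images; the class count
# (46) and all layer sizes are illustrative, not prescriptive.
num_classes = 46
model = models.Sequential([
    layers.Input(shape=(32, 32, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(x_train, y_train, epochs=10, validation_split=0.1)
```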

5.7 Post-processing

This phase is not compulsory; it is sometimes used to improve the accuracy of the HCR system. The accuracy of the HCR system can be increased if the output is constrained by a list of words that are permitted to occur in a document. Post-processing serves to further refine the results of the classification. Dictionary lookup and statistical approaches are commonly used post-processing techniques for error correction [90].
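A minimal dictionary-lookup sketch is given below: each recognized word is replaced by its closest entry in a permitted-word list when the match is close enough. The lexicon entries and the similarity cutoff are placeholders.

```python
import difflib

# A toy lexicon of permitted words (transliterated placeholders).
lexicon = ["bharat", "kamal", "nagar", "vidyalaya"]

def correct(word, cutoff=0.75):
    """Replace an OCR output word with its closest lexicon entry,
    if one is similar enough; otherwise keep the raw output."""
    match = difflib.get_close_matches(word, lexicon, n=1, cutoff=cutoff)
    return match[0] if match else word

print(correct("kamol"))   # -> "kamal"
```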

6 Related work

There has always been a great need for research in the area of HCR for Indian languages, even though there are many challenges and a lack of a commercial market [105]. Research on Indian HCR has gained much attention in recent years, even though basic research on Devanagari recognition, based on a structural approach, was reported as early as 1977 [95]. Various feature extraction methods for Devanagari HCR have been proposed in the past few decades and are briefly outlined in the following sub-sections.

6.1 Feature extraction methods

In this section, the feature extraction methods reported by various researchers in this particular area are presented. Arica and Yarman-Vural [11] calculated both statistical and structural features, while Bajaj et al. [17] considered density, moment, and descriptive component features for handwritten Devanagari numeral/character recognition. Elnagar and Harous [33] recognized handwritten Hindi numerals using end, branch, and cross point features based on strokes and cavity information about these features. Features based on Zernike moments and zoning were extracted by Kaur in 2004 for the recognition of the Devanagari script. Kompalli et al. [52] and Kompalli et al. [53] extracted gradient, structural, and concavity (GSC) features for the recognition of machine-printed and multi-font Devanagari text. Ramteke and Mehrotra [92] extracted features based on moment invariants, whereas Sharma et al. [97] used directional chain code information of the contour points of the characters as features for recognition. A box approach, involving a spatial division of the numeral images into boxes, was proposed in [42, 43] for the recognition of handwritten numerals. Moreover, Pal et al. [82] used chain code and gradient-based features for the recognition of Devanagari numerals. The information gained from the arctangent of the gradient and a Gaussian filter was used as a feature for HCR in Pal et al. [83]. More and Rege [72] recognized handwritten Devanagari numerals using simple geometric and Zernike moments. For the recognition of handwritten Devanagari words, Shaw et al. [100] used a histogram of chain-code directions in image strips, scanned from left to right by a sliding window, as the feature vector. Kumar [56] carried out a comparative analysis of various feature extraction methods, viz. Kirsch directional edges, distance transform, chain code, gradient, and directional distance distribution, on a Devanagari handwritten dataset; moreover, a new feature obtained by quantizing the gradient direction into four directional levels, with each gradient map divided into 4 × 4 regions, was proposed in that paper. Bhattacharya and Chaudhuri [20] extracted high-level features based on contour representations of all four frequency components, i.e., high-high, high-low, low-high, and low-low, of the wavelet-filtered image for handwritten numeral recognition. To obtain better results for two similarly shaped handwritten characters, Wakabayashi et al. [115] discussed a feature extraction technique based on the Fisher ratio (F-ratio). Basu et al. [18] carried out recognition or classification of handwritten digits using a Quad-Tree-based Longest Run (QTLR) feature. Rajput and Mali [91] used Fourier Descriptors (FD) as features for the recognition of handwritten numerals. Handwritten Devanagari compound characters were recognized by Arora et al. [14] by calculating shadow and CH features. Aggarwal et al. [3] used the gradient representation for feature extraction; the 7200 character samples used were normalized to 90 × 90 pixels, and experimental results using Support Vector Machines (SVM) exhibited high performance with a cross-validation accuracy of 94%. Pratap and Arya [90] presented a general idea of the Devanagari character recognition system. An efficient character recognition system using Linear Discriminant Analysis (LDA) followed by a Bayesian discriminant function based on the Mahalanobis distance was proposed by Pourmohammad et al. [88]. In that paper, affine transformations were applied to the training samples in the first step to make the scheme robust against scaling and rotation distortion. A recognition system that uses wavelet features for classification and recognition was proposed by Dixit et al. [30]; it gives a maximum accuracy of 70% over 2000 samples with 20 letters. Singh and Maring [106] used statistical and structural feature extraction techniques, viz. chain code, zone-based centroid, background directional distribution, and distance profile features, for Devanagari HCR. They carried out experiments on more than 20,000 samples with varying image sizes (30 × 30, 40 × 40, and 50 × 50) and achieved 97.61% overall accuracy with SVM. Statistical techniques that can be used for extracting features for handwritten character recognition were described by Ajmire et al. [5]. Tanuja et al. [110] proposed a system for handwritten Hindi character recognition using Canny edge detection, distance transformation, and neural networks with the backpropagation algorithm, and achieved an accuracy of 95.0%.

Ansari and Sutar [9] proposed an effective method for the recognition of isolated handwritten Marathi words written in the Devanagari script. Gradient, distance transform, regional, and geometric features were computed from the word images and used as features. An overall recognition rate of 94.57% was achieved with a Feed-Forward Neural Network (FFNN) classifier; the main recognition errors were observed due to abnormal writing and ambiguity among similarly shaped words. The challenges involved in Indian postal system automation have been discussed with a case study, which also throws light on the existing research literature available to support postal automation [116]. A new kind of masking technique, using the Fisher discrimination function, was applied to extract features from ISIDCHAR (a standard Devanagari database) with an SVM classifier [67]; the authors significantly improved the recognition rate, up to 96.58%, for similar character recognition. An approach to feature extraction was proposed for handwritten Marathi characters (a version of Devanagari) using connected pixel-based features such as area, perimeter, eccentricity, orientation, and Euler number [48]. The authors recorded the comparative accuracy of the proposed methods and concluded that modified SVM gives higher accuracy than the KNN classifier.

Kumar et al. [60] recognized 3D handwritten Devanagari words (3750 samples) using a BLSTM-NN classifier and achieved recognition performances of 50%, 68.10%, 58.40%, and 63.80%, respectively, on raw, convex, curvature, and writing-direction features for Devanagari word samples. The authors achieved maximum accuracy using 3D curvature features for the Devanagari script. To overcome the shape-similarity problem, Bhattacharya et al. [21] proposed a Sub-stroke-wise Relative Feature (SRF) for the recognition of online Devanagari cursive words and achieved a word recognition accuracy of 88.09% on a dataset of 29,900 words. Many researchers have proposed classification methods that utilize the extracted features for character or script identification, as described below. Kumar and Jindal [58] considered various features, viz. zoning, diagonal, horizontal peak extent based, and intersection and open-end point based features, along with different classifiers, viz. k-NN, Linear-SVM, and MLP, for the recognition of multi-lingual characters (English, Hindi, and Punjabi). The authors achieved 92.18%, 84.67%, and 86.79% recognition accuracy for English, Hindi, and Punjabi character recognition, respectively. Narang et al. [76] used statistical features (intersection points, open endpoints, centroid, horizontal peak extent, and vertical peak extent features) and classifiers (CNN, NN, Multilayer Perceptron, RBF-SVM, and random forest techniques) for the recognition of Devanagari ancient manuscripts on a database of 6152 samples. The authors achieved 88.95% recognition accuracy using a combination of various features and classifiers.

Kumar et al. [64] explored hybrid features for the recognition of offline handwritten characters of the Gurumukhi script. They analyzed the performance of their system by combining various features and classifiers along with the AdaBoost approach and obtained a maximum accuracy of 96.3% on a corpus of 14,000 characters. Abuzaraida et al. [1] developed a system for the recognition of handwritten Arabic words based on structural features; using a KNN classifier, they obtained an accuracy of 99.10% on a corpus of 2500 words. Kaur and Kumar [51] explored various feature selection approaches for the recognition of handwritten words and achieved 87.42% recognition accuracy on a corpus of 40,000 handwritten Gurumukhi words using Chi-Squared Attribute (CSA) based feature selection and Random Forest (RF) classification.

6.2 Classification methods

The character recognition system has another important decision-making step called classification, in which features are used to decide the class membership of various characters for their recognition. In this section, the classification methods adopted by various researchers in this particular area are presented. Connell et al. [23] achieved 86.5% recognition accuracy with no rejects by combining multiple classifiers that focus on either local online properties or global offline properties of unconstrained Devanagari characters. Kaur [49] took the feature vector as the input of a feedforward backpropagation NN to classify handwritten Devanagari characters. A quadratic classifier-based method was proposed by Sharma et al. [97]. Pal et al. [83] proposed a modified quadratic classifier for HCR. Arora et al. [12] presented a two-stage classification method for Devanagari HCR: structural properties, namely the shirorekha and the spine of a character, are extracted in the first stage, whereas intersection features are exploited in the second stage and then given to a Feed-Forward Neural Network (FFNN) for classification. Hanmandlu et al. [43] classified Devanagari characters, depending upon the position of the vertical bar, into three classes, viz. end-bar, middle-bar, and characters without any bar. This coarse classification is performed prior to recognition, and handwritten characters are then recognized using a modified exponential membership function fitted to the fuzzy sets resulting from the features of the characters; the authors improved the speed of the learning process using a reuse policy. Deshpande et al. [27] introduced the role of Regular Expressions (RE) in Devanagari HCR, using chain-code features to translate handwritten characters into encoded strings. Two classifiers, namely Support Vector Machines (SVM) and the Modified Quadratic Discriminant Function (MQDF), were combined to achieve higher accuracy for character recognition [84]. Shaw et al. [99] proposed a segmentation-based method for handwritten Devanagari word recognition: word images are segmented into pseudo-characters on the basis of the header line, and these pseudo-characters are further recognized using HMM. To recognize a handwritten word, a continuous density HMM is also presented by Shaw et al. [100].

A dynamic programming-based method was proposed by Pal et al. [85] for the recognition of pin code strings. An Elastic Matching (EM) method based on Eigen Deformations (ED) was proposed for Devanagari HCR [72]; this method consists of two phases, namely a training phase (for ED estimation) and a recognition phase. Pal et al. [86] carried out a comparative study of Devanagari HCR using various classifiers, namely Compound MQDF (CMQDF), compound PD (CPD), Euclidean Distance (ED), k-NN, Linear Discriminant Function (LDF), Mirror Image Learning (MIL), Modified PD (MPD), MQDF, nearest neighbor, PD, Sub-space Method (SM), and SVM. The authors concluded that the MIL classifier gives the best results, whereas ED provides the lowest results among the above classifiers. A divide-and-conquer method was implemented for Devanagari HCR [4]. Hanmandlu et al. [44] classified top modifiers of the Devanagari script either as one touching-point or two touching-point modifiers; further classification was done by examining the core strip of the word. Devanagari non-compound handwritten characters were classified using two MLPs and a Minimum Edit Distance (MED) method by Arora et al. [14]: in the first phase, two MLPs are used to classify distinctly shaped characters, and in a second phase, similarly shaped characters are classified using a MED method. Shelke and Apte [101] recognized Devanagari text using multistage feature extraction and classification methods. Structural features are extracted as an initial step, whereas Radon and Euclidean distance transforms are carried out as the final step of feature extraction. These features are applied to two separate feedforward backpropagation neural networks. The hybrid classifier at the final stage takes the input from the two neural network classifiers and a template matching classifier and produces the final output based on a maximum voting rule. This method considerably improves recognition accuracy over individual classifiers, achieving a recognition rate of 95.40%. Kubatur et al. [55] achieved a recognition rate of up to 97.2% using a neural network-based framework for HCR. Kale et al. [47] achieved overall recognition rates of 98.25% and 98.36% for basic and compound characters, respectively, using Legendre moments as feature descriptors and an Artificial Neural Network (ANN) as the classifier. A novel part-based method was proposed for recognizing Devanagari characters by identifying their 40 basic classes [74].

A very large class recognition problem is avoided by training models to classify an instance of one of these basic classes in any given test sample. The proposed approach gives competitive performance compared with the results obtained by state-of-the-art features and classifiers on the DSIW2K dataset. The Gradient Local Auto-Correlation (GLAC) algorithm was explored for Devanagari HCR using two databases, viz. ISIDCHAR and V2DMDCHAR [66]; the best results obtained using the SVM classifier on ISIDCHAR and V2DMDCHAR are 93.21% and 95.21%, respectively. Dongre and Mankar [31] used a multilayer perceptron neural network (MLP-NN) as a classifier on structural and geometric features for the recognition of Devanagari numerals and characters, obtaining recognition accuracies of 93.17% for numerals (using 40 hidden neurons) and 82.7% for characters (using 60 hidden neurons). Structural and directional features are extracted individually in each local zone for online HCR of the Bengali and Devanagari scripts [37]; these features are concatenated and given to an SVM classifier. The authors attained recognition accuracies of 87.48% and 84.10% for the Bengali and Devanagari scripts using 4900 and 5000 test samples, respectively. Further, two zone-based feature extraction methods, viz. Zone-wise Structural and Directional (ZSD) and Zone-wise Slopes of Dominant Points (ZSDP), have been presented for online HCR of the Bengali and Devanagari scripts [38]. These features are given to an SVM classifier for stroke recognition, and characters are recognized according to the stroke combinations of characters in the training data. It is observed that the recognition performances for the Bengali (9800 test samples) and Devanagari (10,000 test samples) scripts are 87.48% and 85.10% with ZSD, respectively, whereas with ZSDP the accuracies are 92.48% and 90.63%, respectively.

Pagare and Verma [81] implemented a dynamic model based on a Hopfield neural network for auto-associative recognition of Devanagari characters and numerals. Shelke and Apte [102] presented a novel approach based on multi-stage classification for the recognition of unconstrained handwritten Devanagari characters; the classification includes two steps, the first based on a fuzzy inference system and the second on structural parameters, and the recognition accuracy obtained by this method was 96.95%. Shelke and Apte [103] presented techniques for optimizing recognition accuracy at various stages, namely pre-classification, feature extraction, and recognition. Firstly, various structural features were used to classify characters into different classes as pre-classification; after that, features were extracted using optimized feature extraction methods, and finally a neural network was used for recognition. Performance was analyzed by implementing different neural networks. Kumar et al. [61] recognized 3D handwritten Latin (2000 samples) and Devanagari (3750 samples) words using multiple Bidirectional Long-Short Term Memory Neural Network (BLSTM-NN) classifiers and the Recognizer Output Voting Error Reduction (ROVER) framework, achieving accuracies of 72.25% (Latin) and 71.86% (Devanagari) with their lexicon-free approach. Mahesh and Sumit [68] proposed a handwritten Devanagari character recognition system based on Deep Convolutional Neural Networks (DCNN) and adaptive gradient methods. They achieved maximum recognition accuracies of 96.02% and 97.30% on the ISIDCHAR database (36,172 characters), 96.45% and 97.65% on the V2DMDCHAR database (20,305 characters), and 96.53% and 98.00% on the combined databases (ISIDCHAR + V2DMDCHAR = 56,477 characters) using DCNN and layer-wise DCNN, respectively, with the NA-6 and RMSProp optimizers. Further, the authors (2018b) proposed a method for the recognition of handwritten Devanagari characters using gradient-based features and an SVM classifier, achieving 96.58% recognition accuracy on the ISIDCHAR (36,172 characters) dataset. Gupta and Bag [39] achieved character recognition accuracies of 95.10%, 95.57%, 96.09%, and 94.71% for the Hindi language (3000 words as a database) using random forest, SVM, MLP, and CNN classifiers, respectively. Narang et al. [75] presented a paper on recognizing Devanagari ancient manuscripts using AdaBoost and Bagging techniques, achieving maximum recognition accuracies of 90.70% (using DCT zigzag features and an RBF-SVM classifier) and 91.70% (using adaptive boosting with RBF-SVM) on a database of 5484 samples. Narang et al. [78] carried out the recognition of ancient Devanagari characters (5484 samples) using SIFT and Gabor filter-based features and an SVM-based classifier, achieving 91.39% recognition accuracy with a tenfold cross-validation technique and a poly-SVM classifier. Devi et al. [28] explored various machine learning algorithms, namely Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), and K-Nearest Neighbor (KNN), for the recognition of handwritten characters and concluded that the KNN-based system gives the best recognition accuracy of 98%. Singh et al. [107] explored stroke classification based on an RNN classifier for the recognition of Gurmukhi words using a corpus of 52,570 words, achieving a maximum accuracy of 98.67%.

6.3 Deep learning based methods

In statistics and machine learning [96], classification algorithms such as the Naive Bayes classifier, nearest neighbour, logistic regression, decision trees, random forest, neural networks, and KNN classification analyze the training database so that classification of the testing/target database can be carried out. Deore and Pravin [26] developed a dataset of 5800 isolated images covering 58 unique character classes: 12 vowels, 36 consonants, and 10 numerals. The authors implemented a two-stage VGG16 deep learning model for Devanagari handwritten character recognition; their models gained 94.84% (first model) and 96.55% (second model) testing accuracy with training losses of 0.18 and 0.12, respectively. Ghosh [36] extracted structural and directional features from publicly available signature samples and explored a deep learning network, namely the Recurrent Neural Network (RNN), using two models, Long-Short Term Memory (LSTM) and Bidirectional Long-Short Term Memory (BLSTM), for the recognition and verification of signatures in offline mode. The authors concluded that their proposed RNN-based system for signature verification performed better than a Convolutional Neural Network (CNN) and other state-of-the-art methods in terms of accuracy.

Narang et al. [79] used a Convolutional Neural Network (CNN) for the recognition of various ancient manuscripts written in the Devanagari script; they explored a deep learning model for feature extraction and obtained 93.73% recognition accuracy on a corpus of 5484 characters. Alrobah and Albahli [7] developed a system for the recognition of handwritten Arabic characters based on a Convolutional Neural Network (CNN) as the feature extractor and combined two classifiers, namely SVM and eXtreme Gradient Boosting (XGBoost), to improve the recognition accuracy, achieving a recognition rate of 96.3% on the Arabic dataset named Hijaa. Singh et al. [108] developed a system for the recognition of handwritten words written in the Gurumukhi script based on deep learning approaches; they adopted a word-based approach to class labeling, i.e., a holistic approach, so as to obtain satisfactory recognition results, and achieved a recognition accuracy of 97% on their dataset of Gurmukhi words.

Mushtaq et al. [73] developed a CNN architecture to recognize handwritten Urdu characters and obtained 98.82% recognition accuracy on their corpus of Urdu characters (74,285 training and 21,223 testing samples). Korichi et al. [54] developed a system for the recognition of Arabic handwriting based on a Generic Feature-Independent Pyramid Multilevel Model (GFIPML); they used the AHDB dataset for the performance evaluation of their system and achieved better results. Alrobah and Albahli [8] presented a comprehensive survey on the recognition of Arabic text using various deep learning approaches and identified several problems, issues, and challenges in the recognition of Arabic text. Dey et al. [29] presented a system for the recognition of Odia characters based on RNN and CNN and achieved 86.56% recognition accuracy on their corpus of characters with 112 classes. Elkhayati et al. [32] developed an approach to segment Arabic words for recognition purposes using a Convolutional Neural Network (CNN) and Mathematical Morphology Operations (MMO); they proposed a directed CNN and achieved better results compared with a basic CNN.

Gupta and Bag [40] compared the performance of segmentation-based and segmentation-free approaches for the recognition of Devanagari conjunct characters based on CNNs and transfer learning. They used a CNN-RNN hybrid architecture to reduce the complexity of classification and achieved recognition accuracies of 94.56% (analytic approach), 99.30% (CNN-based holistic approach), and 94.65% (CNN-RNN-based holistic approach) for the various approaches adopted. Prashanth et al. [89] developed a corpus of 38,750 images of Devanagari numerals for recognition purposes and explored various CNN architectures, namely CNN, Modified LeNet CNN (MLCNN), and AlexNet CNN (ACNN), to recognize handwritten Devanagari numerals, achieving significant recognition results. Sachdeva and Mittal [94] developed a system for the recognition of handwritten Devanagari compound characters using the ResNet model of the Convolutional Neural Network (CNN); they explored their in-house corpus for the experimental work and achieved good recognition results. Sharma et al. [98] developed a system for the recognition of Gurumukhi city names based on a CNN model and obtained 99.13% recognition accuracy on a corpus of 4000 words (city names) using the Adam optimizer with the CNN model.

6.4 Comparative study

To provide a basic understanding and valuable assistance to newer researchers in this field, a brief summary of handwritten character recognition of the Devanagari script in terms of various parameters is presented in Table 3.

Table 3 Brief summary of handwritten character recognition of Devanagari script

Moreover, Table 4 presents a comparative study of recognition results for Devanagari handwritten character recognition in terms of accuracy (%), grouped by the feature extraction methods used, together with the datasets and classification methods considered.

Table 4 Brief summary of handwritten character recognition of Devanagari script (feature wise)

The effectiveness of the various methods has not been evaluated here, as the experiments were not carried out on the same standard dataset/benchmark. However, this study reveals that work on handwritten character recognition systems with good accuracy rates for the Devanagari script is limited, which points to a future direction.

7 Research gaps

The following are some research gaps for future research directions in the field of optical Devanagari character recognition:

  1. a.

    Recognition of handwritten mathematical expression is still a challenging area in the field of character recognition.

  2. b.

    Character segmentation may result in additional problems due to overlapping, touching, and broken characters.

  3. c.

    It is desirable, and challenging, to detect the best segmentation points for lines, words, and characters in isolation, in order to avoid incorrect segmentation as well as incorrect recognition. Segmentation of handwritten text is a challenging task due to the variety of writing styles of individuals.

  4. d.

    Another challenge is a non-uniform background, which may cause poor recognition results.

  5. e.

    Recognition of historical documents is also a challenging problem due to the low quality of documents, availability of non-standard alphabets and unknown fonts, etc.

  6. f.

    Shape similarity of various characters written in the Devanagari script, such as क-फ, ख-स, घ-ध, थ-य, ब-व, भ-म, and प-ष, is one of the main reasons for misclassification. It is challenging for researchers to identify the small difference (called the critical region) among similarly shaped characters that human beings use to discriminate them.

  7. g.

    To make documents more attractive, people sometimes use artistic (non-linear) text, such as circular, triangular, curved, or arc-form layouts. Existing character recognition systems are unable to recognize artistic text. Hence there is a need to develop conversion models that translate artistic text into simpler linear text so that character recognition can be carried out successfully in such situations.

  8. h.

    Typically, as the size of the class space increases, it becomes more challenging to design a classifier and to find adequate samples of all possible classes for training.

  9. i.

    There is no general approach that is suitable for all kinds of documents, such as degraded or historical documents, and for all environments.

8 Deep learning based approach and research challenges

Deep learning-based approaches may be applied to various fields of pattern recognition, including character recognition [26]. They can help solve many complex tasks/steps of character recognition, such as feature extraction and classification, owing to their powerful potential when the structure and parameters of various deep learning models are adjusted. Although deep learning-based approaches have great potential to replace other conventional approaches, there are still some research challenges [117]:

  1. a.

    It is a challenging job to decide the number of network layers in deep learning-based models and, further, the number of neurons.

  2. b.

    There is a need for larger datasets/databases, as accuracy depends upon the training samples.

  3. c.

    Deep learning-based models have many network parameters, and determining the optimal parameters is also a research challenge.

  4. d.

    Developing efficient deep learning-based models by reducing/minimizing resource requirements, viz. memory space, computational cost, and bandwidth, is a challenging job.

Nowadays, developing a character recognition framework using a deep learning approach is still worth exploring.

9 Suggestions on future directions

In the handwritten character recognition field, numerous directions are possible for future research, as existing algorithms for segmentation, feature extraction, and classification can be extended further to improve the recognition accuracy of character recognition systems. The following are some suggestions on future research directions in handwritten character recognition:

  1. a.

    Development of appropriate and effective preprocessing techniques: Recognition accuracy can be improved by developing appropriate and effective preprocessing techniques such as detection and correction of degradation/warping, orientation, and tilting of text. Further, a suitable technique can be developed for translating artistic text into linear text so that the accuracy of character recognition systems can be improved.

  2. b.

    Preserve the shape of characters: After binarization or normalization, characters may change their shape and significant information can be lost. Hence, there is a need to preserve the shape of the characters.

  3. c.

    Refinement of segmented characters: Segmented characters may be refined in order to achieve better accuracy rates.

  4. d.

    Adding some more features: The performance of HCR systems can be improved by adding some more features to the existing features.

  5. e.

    Exploring combination of various classifiers: Researchers can combine the merits of one classifier (say Convolutional Neural Network) with another (say Recurrent Neural Network) to handle poor recognition accuracy due to various factors such as complex background and blur/noisy/poor quality documents.

  6. f.

    Use different optimizers: Researchers may use various optimizers with a deep learning approach, where the deep convolutional neural network can be trained with different optimizers to improve its recognition rate.

  7. g.

    Use of multiple classifier architecture: Character recognition results may be improved by combining decisions of different individual classifiers. Based upon the results produced by the individual classifier, the combination can be done according to their architecture such as cascading, parallel and hierarchical.

10 Conclusion

Hindi is the most widely spoken language in India and is written in the Devanagari script. Devanagari is one of the working scripts for the Hindi language in government offices in India, apart from English. In view of that, this article focuses on research on the Devanagari script so as to serve as a guide and update for readers working in the area of handwritten character recognition. This paper presents a widespread survey of the feature extraction and classification methods considered so far for online and offline HCR of the Devanagari script, which is essential in OCR research, as presented in Tables 3 and 4. However, it is very hard to judge the success of HCR systems for the Devanagari script in terms of accuracy (%), as they use different constraints, dataset sizes, and sample spaces. There exists no assessment tool to test the performance of individual stages or phases, or the overall performance, of HCR systems. It has also been observed that there is always a trade-off between data acquisition quality and the complexity of the methods, which limits the accuracy (%). In the past few years, many efforts have been made by various researchers on HCR for the Devanagari script, and some major improvements have been achieved; however, machines still cannot recognize human writing with the same fluency as humans. Moreover, available methods suffer from a lack of characterization of the handwriting generation process and the perceptual process of reading, which comprises many complex phenomena.

There is a lack of standard databases for various Indic scripts, Devanagari among them, for experimental work. This article also identifies various challenges that will give direction to future researchers. Researchers are even exploring combinations of multiple features to achieve good recognition accuracy. At present, there is no complete character recognition system available in India for Devanagari. There is a need to extract features that characterize the shape of handwritten characters efficiently, as the information lies in the shape rather than in color, texture, or edges. For text recognition, traditional machine learning based methods mainly focus on feature extraction, whereas deep learning based methods mainly focus on the use of deep neural networks for effective learning. Moreover, researchers can adopt a deep learning approach for more general solutions that extract features automatically. Recognition of handwritten compound characters is still at an initial stage, and the problem needs to be tackled. Future research will be concerned not only with character recognition, but also with word, phrase, and even complete document recognition.