1 Introduction

Our project aims to recognize handwritten digits automatically. Handwritten digits vary widely in form because each person writes them differently. Even so, the system must recognize each digit correctly and return accurate results. Machine learning offers many classification and pattern recognition algorithms, such as KNN, SVM, and neural networks (CNN, ANN, and RNN) [1, 2]. We study how each algorithm behaves while training and testing on the dataset. The project focuses on developing a simple methodology for image recognition that can be reused in a wide range of applications. This paper reports on the research project “An Efficient Novel approach for detection of handwritten numerical using Machine Learning paradigms” [12]. It covers the work done to complete the project, the requirements, the modelling of the project, the technologies used, the implementation, and future enhancements [1, 2].

This project explains in detail how a machine can identify handwritten digits. For this purpose, we use the MNIST (“Modified National Institute of Standards and Technology”) dataset, whose training set consists of 60,000 grayscale images of 28 * 28 pixels [15]. However, raw images alone do not make the task easy. The images need to be pre-processed [2, 4], and several other steps must be performed before training begins. We consider a few classification and recognition algorithms: KNN, SVM, and CNN [3]. We conclude that, among the algorithms we compared, none outperforms the neural network at digit recognition. We try to achieve the highest accuracy and prediction rates by applying the best training methodologies [3, 4].

2 Related Works

The handwritten numeral identification application lets the user's machine recognize the digits 0 to 9, written in any style, with ease [2, 3]. It can serve as a module in future work such as number plate recognition and other number detection projects. It also supports further operations such as calculating attendance percentages, tracking absentees, and sending text messages to parents [14]. This document provides UML diagrams for “An Efficient Novel approach for detection of handwritten numerical using Machine Learning paradigms” that explain the model used to build the application [3, 4]. The literature survey provides insight into the project in terms of user experience and functionality [9].

2.1 Feasibility Study

The feasibility study is a critical part of the software development process. It gives the developer an opportunity to assess the operational flexibility of the product under development [13]. Operational flexibility, technical support, the output of the project, and other factors serve as criteria for analysing the feasibility of the project. The three basic types of feasibility study are operational, technical, and economic feasibility.

2.1.1 Operational Feasibility

The application must be built so that users can operate it with ease. The operational feasibility of the system is determined by how willing users are to use the application. Before operational feasibility is assessed, users must be adequately trained on the application [11]. They should not face any difficulty in using the system. Their willingness depends on how user-friendly the interface is and on the methods used to familiarize them with handling and using the system. Our project employs a simple, intuitive interface for navigating and operating the application.

2.1.2 Technical Feasibility

This study determines the technical requirements needed to develop the system. Our project is developed with care so that it is technically feasible [12]. Changes can easily be made and adopted according to the user's requirements. The popular programming language Python and its libraries made development of the application simple.

2.1.3 Economic Feasibility

This study assesses the economic feasibility of the system under development. Expenditures must be justified. Cost is a major drawback of existing systems because they require specially developed devices [3, 15]. Our project, in contrast, is designed within a limited budget, which is possible because it relies on freely available technologies. The only external device used is a webcam, which is inexpensive.

3 System Architecture

See Fig. 1.

Fig. 1. Model framework

4 Methodology

4.1 Data

The MNIST (Modified National Institute of Standards and Technology) data set provides a total of 70,000 images of handwritten digits, with 60,000 examples in the training set and 10,000 examples in the test set, all labeled with one of 10 digits (0 to 9). It is a small subset of the larger NIST collection in which the digits were normalized to fit a 20 * 20 pixel frame without changing the aspect ratio [17]. Each handwritten digit is stored as a 28 * 28 array of grayscale intensities, and the first column of each record holds the label (0 to 9). The test set follows the same layout: 10,000 images with labels ranging from 0 to 9 (Figs. 2, 3 and 4).

Fig. 2. Data

Fig. 3. Table of memory allocation in Hadoop ecosystem

Fig. 4. Table of memory segregation in Hadoop environment for file distribution
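The record layout described in this section (a label followed by 28 * 28 grayscale intensities) can be illustrated with a minimal sketch. It assumes a comma-separated text row and 0 to 255 pixel values; the helper name `parse_row` and the synthetic sample row are hypothetical and are not taken from our implementation.

```python
# Illustrative sketch only: parse one MNIST-style CSV row (label first, then 784 pixels).
# The CSV layout, the 0-255 pixel range, and the name `parse_row` are assumptions.

IMAGE_SIDE = 28  # MNIST images are 28 x 28 pixels


def parse_row(row):
    """Split a CSV row into (label, image) and scale pixels from 0-255 to 0.0-1.0."""
    values = [int(v) for v in row.strip().split(",")]
    label, pixels = values[0], values[1:]
    if len(pixels) != IMAGE_SIDE * IMAGE_SIDE:
        raise ValueError("expected 784 pixel values, got %d" % len(pixels))
    scaled = [p / 255.0 for p in pixels]
    # Reshape the flat list into 28 rows of 28 intensities each.
    image = [scaled[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE] for r in range(IMAGE_SIDE)]
    return label, image


# Synthetic row: label 7, all pixels dark except one bright pixel at row 3, column 5.
fake_pixels = [0] * (IMAGE_SIDE * IMAGE_SIDE)
fake_pixels[IMAGE_SIDE * 3 + 5] = 255
sample_row = ",".join(["7"] + [str(p) for p in fake_pixels])

label, image = parse_row(sample_row)
print(label, len(image), len(image[0]), image[3][5])
```

Scaling the intensities to the range 0 to 1 is one common pre-processing choice; other normalizations are equally possible.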

4.2 Support Vector Machines (SVM)

A support vector machine (SVM) is capable of a high level of accuracy. SVM can be used for classification, which involves drawing a boundary between two categories or classes to distinguish them [6]. The boundaries that separate different classes are called hyperplanes [8]. However, the classes are not always cleanly separable [11]. In that situation, the data must be mapped from its original space into a higher, N-dimensional space, which is done with a kernel. The three common types are the linear, polynomial, and radial basis function (RBF) kernels [8, 9]. In the multidimensional space, the resulting hyperplane divides the classes [17]. SVM iteratively develops an optimal hyperplane that minimizes the error [7, 13]. However, because it takes a long time to train and performs worse with overlapping classes, this classifier is not well suited to large data sets. It is also sensitive to the type of kernel used [8, 16].

SVM is a supervised learning method that can be applied to both classification and regression problems. In general, SVM finds an optimal hyperplane that separates the data into different categories [18]. In two-dimensional space, we first plot the data points of the independent variables against the corresponding dependent variables. Classification then proceeds by finding the hyperplane, or any linear or nonlinear boundary, that best separates the two classes (Fig. 5).

Fig. 5. Support Vector Machines (SVM)
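The three kernel types named above can be written out as plain functions. This is a sketch under stated assumptions: the parameter defaults (`degree`, `coef0`, `gamma`) are illustrative values only, since the kernel and hyperparameters used in our experiments are not specified in this section.

```python
import math

# Illustrative sketch of the linear, polynomial, and RBF kernels.
# The default values of degree, coef0, and gamma are assumptions for this example.


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    return (dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    # Similarity decays exponentially with the squared Euclidean distance.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


u, v = [1.0, 2.0], [2.0, 0.0]
print(linear_kernel(u, v))
print(polynomial_kernel(u, v))
print(rbf_kernel(u, v))
```

A kernel returns a similarity score between two samples; replacing the plain inner product with a polynomial or RBF kernel is what lets the classifier draw a nonlinear boundary in the original space.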

4.3 K Nearest Neighbours (KNN)

This is the most basic classifier for categorizing images. It is summed up by a simple saying: “Tell me your neighbors and I will tell you who you are.” The approach relies solely on the distance between feature vectors [7]. To classify a new sample, it finds the most common class among the K closest training samples. Euclidean distance can be used as the distance metric [8].

This algorithm gave an accuracy of 92.8%. However, it has significant drawbacks in areas such as feature selection [8] and dimensionality reduction. KNN is a nonparametric classifier used for both classification and regression problems [17]. It is a lazy learning algorithm: all computation is deferred until the classification stage. It is also an instance-based learning algorithm, in which the function is approximated locally. It is among the simplest algorithms to implement, since there is no explicit training phase and the algorithm does not build a generalized model from the training data beforehand (Fig. 6).

Fig. 6. K Nearest Neighbours (KNN)
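The voting rule described above can be sketched in a few lines. The example below is a minimal illustration on two-dimensional toy points rather than 784-dimensional digit images; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter

# Illustrative sketch of K nearest neighbours with Euclidean distance.
# Function names and the toy data below are assumptions for this example.


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points closest to query."""
    distances = sorted(
        (euclidean(p, query), label) for p, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Two small clusters standing in for two digit classes.
train_points = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_labels = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_points, train_labels, [0.5, 0.5], k=3))
print(knn_predict(train_points, train_labels, [5.5, 5.5], k=3))
```

Note that no model is fitted in advance: every prediction scans the full training set, which is why the method becomes slow as the data set grows.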

4.4 Stochastic Gradient Descent

Stochastic gradient descent (SGD) is a widely used optimization algorithm that underlies many machine learning methods; most importantly, it forms the basis of neural network training. Gradient descent is an iterative algorithm that starts from an arbitrary point on a function and moves down its slope in steps until it reaches the lowest point of that function [18]. The stochastic variant estimates the slope from a single sample or a small batch at each step instead of from the whole data set (Figs. 7 and 8).

Fig. 7. Stochastic Gradient Descent (SGD)
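The step-by-step descent can be illustrated on a problem much smaller than digit recognition. The sketch below fits a straight line to noise-free toy data, updating the parameters after each individual sample; the function name, learning rate, epoch count, and data are all assumptions made for illustration.

```python
import random

# Illustrative sketch of stochastic gradient descent on a one-variable linear model.
# learning_rate, epochs, seed, and the toy data are assumptions for this example.


def sgd_linear_fit(xs, ys, learning_rate=0.05, epochs=500, seed=0):
    """Fit y = w * x + b by minimizing the squared error one sample at a time."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # arbitrary starting point
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a random order each epoch
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error**2 with respect to w and b.
            w -= learning_rate * error * xs[i]
            b -= learning_rate * error
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]  # generated from w = 2, b = 1
w, b = sgd_linear_fit(xs, ys)
print(round(w, 3), round(b, 3))
```

Neural networks are trained with the same update rule, with the gradient for each weight obtained by backpropagation.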

Table: Comparison Analysis

Fig. 8. Model accuracy

In this section, the processed images are sent as input to the various algorithms. The precision values are 0.96, 0.95, and 0.89, and the F1 scores are 0.92, 0.90, and 0.81, respectively. Choosing the correct data set, pre-processing the data with the right techniques, designing the model, and many other tasks combine to produce better performance. The CNN in our system efficiently labels all the test images with the names of their individual classes. This did not happen in any case. Each epoch has its own data set [18]. Accuracy appears to increase with each epoch. The validation set was evaluated after the model was trained on the training set and showed a precision of 0.96. When the model was then given the test data as input, it showed an accuracy of 0.97 (97%).
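For reference, the evaluation measures reported here can be computed from true and predicted labels as follows. This is a minimal sketch: the function names are hypothetical, precision and F1 are computed for one digit class at a time (one-vs-rest), and the label lists are made-up toy values, not our experimental results.

```python
# Illustrative sketch of accuracy, precision, recall, and F1 for one class.
# The function names and the toy labels are assumptions, not experimental data.


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 score, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


y_true = [3, 3, 3, 5, 5, 8]
y_pred = [3, 3, 5, 5, 3, 8]

print(accuracy(y_true, y_pred))
print(precision_recall_f1(y_true, y_pred, positive=3))
```

Because F1 is the harmonic mean of precision and recall, it can fall below precision when the classifier misses some instances of a class.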

5 Conclusion

In this paper we saw how different algorithms perform in handwritten digit recognition. There are many classification and recognition algorithms; K-Nearest Neighbours and Support Vector Machine are among the best known and most widely used. All these algorithms consider various factors during training and testing. We used the MNIST handwritten digit dataset, whose training set consists of 60,000 grayscale images of 28 * 28 pixels showing the handwritten digits 0 to 9. All images should be pre-processed using appropriate techniques. In the study, the main features are extracted from each image for further processing and training. The processed images are sent as input to the various algorithms. The precision values are 0.96, 0.95, and 0.89, and the F1 scores are 0.92, 0.90, and 0.81, respectively. Choosing the correct data set, pre-processing the data with the right techniques, designing the model, and many other tasks combine to produce better performance. The CNN in our system efficiently labels all the test images with the names of their individual classes. This did not happen in any case. Each epoch has its own data set. Accuracy appears to increase with each epoch. The validation set was evaluated after the model was trained on the training set and showed a precision of 0.96. When the model was then given the test data as input, it showed an accuracy of 0.97 (97%). The strong performance of the model is a direct result of the use of authentic images and of the structure of the model.