1 Introduction

Maize (Zea mays L.), also called corn, is one of the most versatile agricultural crops and adapts well to varied agro-climatic conditions. It is the third-largest agricultural produce after wheat and rice. Maize is the second most popular cereal crop and is therefore known as the “Queen of Cereals”. It contains 79.95% starch, 10.11% protein, and 4.19% fat, supplying an energy density of 3365 kcal/kg. Maize is mainly cultivated in upland tracts during the kharif season. The color of maize ranges between yellow, lemon yellow, and gamboge. A distinctive feature of maize is its internal diversity. Different types of maize, such as grain maize, sweet corn, baby corn, and popcorn, are cultivated in India; Fig. 1 shows (a) grain maize, (b) sweet corn, (c) baby corn, and (d) popcorn. Maize is usually consumed boiled in water, fried, or roasted over a fire.

The maize crop is subject to many diseases and to attacks by crop-eating insects. Maize leaf diseases mainly result in decreased production and economic loss to farmers. Artificial intelligence (AI) is a broad area within which machine learning and deep learning form successive layers. Significant applications [2, 9, 29] and solutions have increased the demand for AI techniques. AI techniques have been applied successfully in the health sector, for example in malaria cell detection [1]. Owing to technological advances over the last few years, the problem of plant disease analysis and severity estimation has attracted a lot of attention from researchers and practitioners [4]. This problem belongs to the agriculture sector, which plays a very significant role in the growth of the Indian economy: it contributes about 17% of the total GDP and provides employment to over 60% of the population [31]. The relevance of maize diseases to the societal context of the country motivates further research in the field of plant disease identification and severity analysis.

Plant disease identification and classification [22] employs advanced technology and digital tools to detect and classify plant diseases, because the traditional methods in practice (manual monitoring by farmers with limited facilities, exposure, and technological support) are not efficient [32]. This creates an urgent need to investigate computer-assisted methods and classification models that distinguish diseased plants from healthy plants at an early stage, so that remedial treatment can be applied in a timely manner. The literature indicates that image processing has been used in diverse agricultural diagnosis applications for the timely detection of plant diseases and the assessment of plant fitness. Disease diagnosis [26] is a challenging task that depends on symptoms, such as colored spots or rust, seen on the leaves of a plant. In maize plants, Cercospora leaf spot and Gray leaf spot are shown in Fig. 2 and Common rust is shown in Fig. 3, whereas a healthy maize leaf is presented in Fig. 4. The visual symptoms of a disease can be traced to the leaves or stem of a plant [14].

Fig. 1 (a) Grain maize, (b) Sweet corn, (c) Baby corn, and (d) Popcorn

Fig. 2 Cercospora leaf spot and Gray leaf spot of maize crop

Fig. 3 Common rust of maize crop

Fig. 4 Healthy leaves of maize crop

Computer vision techniques can help protect maize plants from disease [25]. Recently, deep convolutional neural networks [21] have driven new developments in the area of machine learning (ML). Such methods show a strong ability for feature extraction and classification in object detection [16] and other related fields [17, 18]. They extract image features layer by layer, starting from the pixel level. Recent research has documented progress in automatic computer vision tasks such as brain and breast cancer detection [30], automatic detection of motor vehicle riders without helmets [5], and face mask detection [24]. In the literature, various deep learning models [8, 7, 10, 27] have been implemented to obtain appropriate solutions. An artificial neural network [13] was designed using a genetic algorithm for weight evolution; the objective of that work was automatic defect classification of cherry images, each of which was classified into one of seven categories. A leaf vein pattern based plant identification model was developed by [12] using a convolutional neural network (CNN). Automatic feature extraction and classification [28] of tomato plant leaf disease was developed using a CNN together with the Learning Vector Quantization (LVQ) algorithm. Multi-class plant disease classification [23] was addressed with a deep convolutional neural network, achieving 96.02% accuracy with 32 epochs. ResNet [6] has been implemented for plant seedling image classification with the objective of plant farming.

2 Proposed AlexNet network

AlexNet [11] is an 8-layer CNN that won the ImageNet Large Scale Visual Recognition Challenge 2012. The layer structure of AlexNet resembles that of the LeNet network, but AlexNet is deeper, with more convolutional layers and a larger parameter space. The key idea behind AlexNet is that features obtained by learning can surpass hand-crafted features, which overturned the previous paradigm in visual object recognition. In the first part, the input images of 224 × 224 × 3 are filtered by the first convolution layer with 96 kernels of size 3 × 3 × 3 and a stride of 1 pixel, followed by max pooling with a filter size of 2. Then, the second convolution layer takes the feature maps from the first layer and convolves them with 96 filters of size 3 × 3 × 256. The second part also contains two convolution layers; the pooled feature maps from the first part are fed to the convolution layers of this part. In the third part, we use two convolution layers with 384 kernels of size 3 × 3 × 3. In the fourth part, one convolution layer with 256 kernels of size 3 × 3 × 256 and one max pooling layer are applied. After the convolution and pooling of the fifth part, the output is fed into three fully connected layers, which have 4096, 4096, and 1000 neurons, respectively. The ReLU activation function [20] is applied in each layer. Max pooling layers are added to retain the strongest features of the images. These convolution operations are applied to the maize images, and various types of features, such as edges, curves, and lines, are extracted (Fig. 5).

Fig. 5 Layering architecture of AlexNet

The extracted features are passed through the fully connected layers, which summarize them into the most discriminative features; the CNN classifier then produces a one-dimensional output vector that is used for classification. The proposed AlexNet CNN model is shown in Fig. 6.

Fig. 6 The proposed model of AlexNet

2.1 Input layer

The maize image is applied as input to the CNN model for testing. For this work, images are taken from the PlantVillage dataset [21]. In the dataset, maize plant images fall into healthy and unhealthy categories. All maize images are resized to 224 × 224 × 3 (height 224, width 224, and 3 color channels). Further, data augmentation operations [3] such as rotation, zoom-in, zoom-out, blurring, background noise, brightening, and darkening are applied to the maize images.

2.2 Convolution layer

The convolution layer is the first layer of AlexNet and performs the convolution operation on the input image. It extracts image features by applying filters to produce the output feature maps, as formulated in Eq. 1.

$${x}_{j}^{l}=f\left(\sum\nolimits _{i\in {M}_{j}}{x}_{i}^{l-1}*{k}_{ij}^{l}+{b}_{j}^{l}\right)$$
(1)

where \({x}_{j}^{l}\) is the output feature map, \({M}_{j}\) is the set of input maps, \({k}_{ij}^{l}\) is the kernel, \({b}_{j}^{l}\) is the bias, and \(f\) is the activation function.
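As a minimal sketch of Eq. 1 for a single output map, the following pure-Python code sums the valid 2D cross-correlation (the operation that CNN libraries usually call convolution) of each input map with its kernel, adds the bias, and applies an activation. The function names, the stride of 1, the absence of padding, the choice of ReLU as the activation, and the toy values are assumptions made for illustration; they are not taken from the implementation used in this work.

```python
def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def conv_output_map(input_maps, kernels, bias, activation=lambda v: max(0.0, v)):
    """One output map: activation(sum over input maps of (map * kernel) + bias)."""
    partial = [correlate2d_valid(x, k) for x, k in zip(input_maps, kernels)]
    rows, cols = len(partial[0]), len(partial[0][0])
    return [
        [activation(sum(p[r][c] for p in partial) + bias) for c in range(cols)]
        for r in range(rows)
    ]


# Toy example: one 3x3 input map and one 2x2 kernel (hypothetical values).
x = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
k = [[1, 0],
     [0, -1]]
feature_map = conv_output_map([x], [k], bias=0.5)
# feature_map == [[0.5, 0.0], [0.5, 0.5]]
```

With several input maps, the same call takes one kernel per map and sums their responses before the bias is added.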

The spatial size of the output feature map is given by Eq. 2:

$$O=\frac{W-K+2P}{S}+1$$
(2)

where \(O\) is the height/width of the output feature map, \(W\) is the height/width of the input feature map, \(K\) is the filter size, \(P\) is the padding, and \(S\) is the stride.

The padding \(P\) that preserves the input size at a stride of 1 is calculated as

$$P=\frac{K-1}{2}$$
(3)

where, \(K\) is the filter size.
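A short worked example of Eqs. 2 and 3 is given below, assuming the 224-pixel input, the 3 × 3 filter, and the stride of 1 mentioned for the first convolution layer. The pooling stride of 2, the use of floor division, and the function names are assumptions for illustration.

```python
def same_padding(kernel_size):
    """Eq. 3: padding that preserves the size for stride 1 and an odd kernel."""
    return (kernel_size - 1) // 2


def conv_output_size(input_size, kernel_size, padding, stride):
    """Eq. 2: O = (W - K + 2P) / S + 1, rounded down when not exact."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


p = same_padding(3)                              # 1
conv_out = conv_output_size(224, 3, p, 1)        # 224: size is preserved
pool_out = conv_output_size(conv_out, 2, 0, 2)   # 112: 2x2 pooling halves it
```

Under these assumptions, a 3 × 3 convolution keeps the 224 × 224 spatial size and each 2 × 2 pooling step halves it.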

2.3 ReLU layer (Rectified Linear Unit)

The ReLU activation function [19] is used to introduce non-linearity, because the convolution layer performs only a linear operation. ReLU activation is applied after each convolution layer. The ReLU layer sets all negative values to zero and leaves positive values unchanged. ReLU activation is expressed in Eq. 4.

$$f\left(x\right)=\max \left(0,x\right)=\left\{\begin{array}{ll}x, & x>0\\ 0, & x\le 0\end{array}\right.$$
(4)

The function \(f\left(x\right)\) returns the maximum of \(x\) and \(0\): if the value is negative, the output is \(0\). The processing through the convolution layers, hidden layers, fully connected layers, and output layer is shown in Fig. 6.
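The element-wise behavior of ReLU can be sketched in a few lines. The function names and the sample feature map are hypothetical and serve only to illustrate the definition.

```python
def relu(x):
    """Return x for positive inputs and 0 otherwise."""
    return x if x > 0 else 0.0


def relu_map(feature_map):
    """Apply ReLU to every element of a 2D feature map."""
    return [[relu(v) for v in row] for row in feature_map]


activated = relu_map([[-1.5, 2.0], [0.0, -0.1]])
# activated == [[0.0, 2.0], [0.0, 0.0]]
```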

2.4 Max pooling layer

In the max pooling layer, the maximum element is taken from each region of the feature map covered by the filter. That is, the input is divided into multiple non-overlapping blocks and the maximum element of each block is used to generate the pooled feature map. Max pooling therefore reduces the spatial size of the output relative to the input.
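The non-overlapping pooling described above can be sketched as follows. The 2 × 2 window and the sample values are assumptions for illustration, and rows or columns that do not fill a complete window are simply ignored in this sketch.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling: the stride equals the window size."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


fm = [[1, 3, 2, 0],
      [4, 2, 1, 1],
      [0, 1, 5, 6],
      [2, 2, 7, 3]]
pooled = max_pool2d(fm)
# pooled == [[4, 2], [2, 7]]
```

A 4 × 4 map is reduced to 2 × 2, with each output value being the largest element of its block.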

2.5 Dropout layer

The dropout layer is used like a regular layer of the CNN model. Input neurons are dropped with a given probability so that every neuron learns features that depend less on its neighbors. Dropout is applied only during the training phase.
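A minimal sketch of dropout is shown below. It uses the common "inverted dropout" variant, in which surviving activations are rescaled during training so that nothing needs to change at test time; this variant, the drop probability of 0.5, and the function name are assumptions, since the text does not state which form or rate was used.

```python
import random


def dropout(activations, drop_prob, training=True, rng=random):
    """Zero each value with probability drop_prob and rescale the survivors."""
    if not training or drop_prob == 0.0:
        return list(activations)  # identity outside the training phase
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
acts = [0.5, 1.0, 1.5, 2.0]
train_out = dropout(acts, drop_prob=0.5, rng=rng)        # some values zeroed
test_out = dropout(acts, drop_prob=0.5, training=False)  # unchanged
```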

2.6 Batch normalization layer

The batch normalization layer is placed between two layers, such as a convolution layer and a ReLU layer. Batch normalization speeds up training and reduces the sensitivity to network initialization. The activations of each channel are normalized by subtracting the mini-batch mean and dividing by the mini-batch standard deviation.

Here \(\beta\) is a learnable offset that shifts the normalized input and \(\gamma\) is a learnable scale factor. Both \(\beta\) and \(\gamma\) are updated during the training phase. The batch-normalized output \({y}_{i}\) is

$${y}_{i}=B{N}_{\gamma ,\beta }\left({x}_{i}\right)\equiv \gamma {\widehat{x}}_{i}+\beta$$
(5)

where \({\widehat{x}}_{i}\) is the normalized value of the activation \({x}_{i}\):

$${\widehat{x}}_{i}=\frac{{x}_{i}-{\mu }_{B}}{\sqrt{{\sigma }_{B}^{2}+\epsilon }}$$
(6)

where \(\epsilon\) is a small constant that keeps the denominator away from zero, \({\mu }_{B}\) is the mini-batch mean, and \({\sigma }_{B}^{2}\) is the mini-batch variance.

$${\mu }_{B}=\frac{1}{m}\sum\nolimits _{i=1}^{m}{x}_{i}$$
(7)

$${\sigma }_{B}^{2}=\frac{1}{m}\sum\nolimits _{i=1}^{m}{\left({x}_{i}-{\mu }_{B}\right)}^{2}$$
(8)

where \(m\) is the mini-batch size.
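The batch normalization steps described in this section can be sketched for a single channel as follows. The values of gamma, beta, and epsilon and the sample batch are hypothetical; in a trained network gamma and beta would be learned parameters.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a mini-batch of scalar activations, then scale and shift."""
    m = len(batch)
    mean = sum(batch) / m                          # mini-batch mean (Eq. 7)
    var = sum((x - mean) ** 2 for x in batch) / m  # mini-batch variance (Eq. 8)
    # Subtract the mean and divide by the standard deviation.
    x_hat = [(x - mean) / math.sqrt(var + eps) for x in batch]
    # Scale by gamma and shift by beta.
    return [gamma * xh + beta for xh in x_hat]


y = batch_norm([1.0, 2.0, 3.0, 4.0], gamma=2.0, beta=0.5)
```

After normalization the outputs have a mean equal to beta and a standard deviation close to gamma, independent of the scale of the inputs.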

2.7 Fully connected layer

In the fully connected layer, every neuron is connected to all neurons in the previous layer and to all neurons in the next layer, so that the features from the previous layer are combined to facilitate classification.
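As an illustrative sketch, a fully connected layer can be written as one weighted sum per output neuron over all inputs. The weights, biases, and the three-element feature vector below are hypothetical; the layers in this work have 4096, 4096, and 1000 neurons.

```python
def fully_connected(inputs, weights, biases):
    """Each output neuron is a weighted sum of all inputs plus its bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


features = [1.0, 2.0, 3.0]      # flattened feature vector (hypothetical)
weights = [[1.0, 0.0, -1.0],    # one row of weights per output neuron
           [0.5, 0.5, 0.5]]
biases = [0.0, 0.5]
scores = fully_connected(features, weights, biases)
# scores == [-2.0, 3.5]
```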

2.8 Softmax function

The Softmax function is used in the classification layer. It converts the network output into a probability distribution over the classes, and the CNN model reports the class with the maximum probability as its prediction. The function is expressed in Eq. 9.

$$P\left({c}_{r}|x,\theta \right)=\frac{P\left(x,\theta |{c}_{r}\right)P\left({c}_{r}\right)}{\sum _{j=1}^{k}P\left(x,\theta |{c}_{j}\right)P\left({c}_{j}\right)}$$
(9)

where \(0\le P\left({c}_{r}|x,\theta \right)\le 1\) and \(\sum _{j=1}^{k}P\left({c}_{j}|x,\theta \right)=1\). Here \(P\left(x,\theta |{c}_{r}\right)\) is the conditional probability of the instance given class \(r\), and \(P\left({c}_{r}\right)\) is the class prior probability (Table 1).
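In practice, the Softmax output is usually computed from raw class scores by exponentiating and normalizing them. The sketch below follows that common form; reading each score as the logarithm of the class-conditional probability times the prior is an assumption that connects it to Eq. 9, and the scores and class labels are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class_names = ["Cercospora/Gray leaf spot", "Common rust", "Healthy"]
scores = [2.0, 0.5, -1.0]  # hypothetical outputs of the last layer
probs = softmax(scores)
predicted = class_names[probs.index(max(probs))]  # class with highest probability
```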

Table 1 Layer implementation of the AlexNet Model

3 Maize Plant Leaf Disease classification and identification

3.1 Image acquisition for maize plant disease

Maize leaf images are taken from the PlantVillage dataset [15]. We use 1363 unhealthy images, of which 410 show Cercospora leaf spot and Gray leaf spot and 953 show Common rust. In addition, 929 healthy maize leaf images are used for disease testing. The quality of the acquired images is good, but we apply preprocessing to resize each image to 224 × 224 × 3 pixels before computing its feature maps. The dataset contains three types of maize leaf disease (Cercospora leaf spot, Gray leaf spot, and Common rust) together with healthy maize leaves. It is divided into a training set and a testing set: about 60% of the images (1363) are used for training and the remaining 40% (929) for testing. Samples of the PlantVillage dataset images are shown in Fig. 1.
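The image counts quoted above can be cross-checked with a few lines of arithmetic. The dictionary keys are shorthand labels chosen for this sketch; the counts themselves are those stated in the text.

```python
counts = {
    "cercospora_gray_leaf_spot": 410,
    "common_rust": 953,
    "healthy": 929,
}

unhealthy = counts["cercospora_gray_leaf_spot"] + counts["common_rust"]
total = sum(counts.values())

train_size, test_size = 1363, 929  # split sizes stated in the text
train_pct = round(100 * train_size / total, 1)
test_pct = round(100 * test_size / total, 1)
```

The totals come to 1363 unhealthy images and 2292 images overall, so the stated 60%/40% split corresponds to roughly 59.5% and 40.5% of the images.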

4 Analysis of the model

In the proposed study, the AlexNet model has been implemented for the detection of maize leaf disease. The proposed AlexNet model uses 5 convolution layers and 3 max pooling layers and is trained for 25, 50, 75, and 100 epochs. We observed that the accuracy generally improves as the number of epochs increases. The tested accuracy of the AlexNet CNN is 87.2222% after 25 epochs (Table 2), 98.6111% after 50 epochs (Table 3), 99.7222% after 75 epochs (Table 4), and 99.6111% after 100 epochs (Table 5). The proposed AlexNet CNN model uses a loss function to guide optimization. The loss is calculated separately on the training set and the validation set. The validation loss depends on the total number of images tested and on the number of images that the model does not classify correctly.

Table 2 Layer implementation of the AlexNet CNN model for 25 epochs iteration
Table 3 Layer implementation of the AlexNet CNN model for 50 epochs iteration
Table 4 Layer implementation of the AlexNet CNN model for 75 epochs iteration
Table 5 Layer Implementation of the AlexNet CNN model for 100 epochs iteration
Fig. 7 The training and validation accuracy for maize leaf disease: (a) 25 epochs, (b) 50 epochs, (c) 75 epochs, (d) 100 epochs

Accuracy is the metric used to evaluate the predictions of the model. It is the proportion of predictions that match the true labels, so the accuracy loss depends on the number of images that the proposed CNN model does not classify correctly. The training and validation accuracy of the AlexNet CNN model for maize leaf disease, corresponding to the tables above, is plotted in Fig. 7.
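As a minimal illustration of the accuracy metric, the fraction of correctly classified images can be computed as follows. The labels and predictions are hypothetical and are not results from this study.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


y_true = ["rust", "healthy", "spot", "rust", "healthy"]
y_pred = ["rust", "healthy", "rust", "rust", "healthy"]
acc = accuracy(y_true, y_pred)  # 4 of 5 correct, i.e. 0.8
```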

The experiment is performed with 25, 50, 75, and 100 epochs for classification using AlexNet. Figure 7 shows the training and validation accuracy for the maize plant. We observed that the gap between training and validation accuracy is large when the model is trained for 25 or 50 epochs. To obtain better results, we also trained the model for 75 and 100 epochs and achieved a tested accuracy of 99.16%. Figure 8 shows the training and validation loss for the maize plant. We found that with 25 epochs the accuracy loss is high and the corresponding validation loss is also high, so the classification is poor and the tested accuracy of 87.22% is not reliable. The number of epochs was therefore increased to 50, 75, and 100, and the accuracy loss decreased compared with the 25-epoch case. In this way, we reduced the tested loss to 0.0014.

Fig. 8 The training and validation loss for maize leaf disease: (a) 25 epochs, (b) 50 epochs, (c) 75 epochs, (d) 100 epochs

After 100 epochs, the proposed model achieves a training accuracy of 98.89% with a corresponding loss of 6.27%, while the validation accuracy is 76.11% and the validation loss is 24.04% on the data used for testing the model. The training and validation accuracy and loss for the maize leaf dataset are given in Tables 2, 3, 4, and 5. The efficiency of the model is established by comparison with other soft computing techniques, namely VGG16, SVM, and ANN, as shown in Table 6.

Table 6 Performance evaluation of proposed CNN model

VGG16 gives better classification results when the dataset is large; it is not efficient on smaller datasets. Another challenge with VGG16 is that it does not accept leaf images smaller than 224 \(\times\) 224, and it does not provide the expected results on images of lower resolution. SVMs are commonly applied as classification models for plant disease detection and classification. They work well when the two classes are separated by a clear margin. SVM is not suited to large datasets in the way VGG16 is. If the leaf images are noisy, it does not give proper results because the target classes overlap. Another challenge is that SVM fails when the number of features for each data point exceeds the number of training samples. An ANN is a type of neural network used as a classification model; it is formed by interconnected neurons. An ANN can handle multiple features of a plant leaf at the same time. However, the training process of an ANN cannot be stopped abruptly, and because its behavior is difficult to explain, ANN does not give efficient plant classification results.

5 Conclusions

We have proposed a deep neural network for the identification and classification of maize leaf diseases. We are aware of the losses caused by biological factors that affect plant health. These not only reduce crop productivity but also affect the environment, and they have a major impact on the wealth of farmers. It is therefore essential to introduce artificial intelligence to detect diseases in time and to support their treatment. In this study, the experimental results, validated on the disease classes and compared with different methods, show that the proposed method achieves higher performance, with an accuracy of 99.16%. During the analysis, the main challenges were the large number of training epochs required by the AlexNet CNN and the handling of the large maize disease dataset. This study can be applied efficiently to give instant information about maize disease. It can also help reduce outbreaks and upsurges, which cause huge losses to maize crops and pastures and threaten the livelihoods of vulnerable farmers.