1 Introduction

Plants are fundamental to the Earth's ecosystems and play a central role in safeguarding the environment (Narayan and Subbarayan 2014). They provide fuel, medicines, and food, and help maintain a stable climate. Owing to continuing deforestation (Kalyoncu and Toygar 2015; Horaisova and Kukal 2016), numerous plant species are on the verge of extinction. Hence, plants need to be catalogued promptly by building databases that support efficient and quick classification and grouping (Kalyoncu and Toygar 2016). Most such systems extract visual features such as shape, texture, and color from plant images and use them as the information model for classification and comparison (Rhouma et al. 2017). Although several plant parts, such as the root, seed, fruit, bud, and flower, can be used for comparison, leaf-oriented classification is the most widely accepted and feasible technique. Leaves carry significant identifying features, including leaf edges, veins, texture, color, shape, and leaf type (compound or simple). Computer systems recognize plant species by applying image processing algorithms to leaf images. Compared with roots and stalks, leaves are simple to scan or to photograph with a digital camera (Zhang et al. 2013). Hence, the composition of the leaf database is a crucial part of such a system.

The primary task is to reduce the dimensionality of the unstructured leaf image data (Caoa et al. 2016). An efficient dimensionality reduction algorithm allows the data to be analyzed effectively and supports decision making, while avoiding the “curse of dimensionality”. Various dimensionality reduction techniques have been presented in recent decades (Zhang et al. 2016; Tang et al. 2015). These techniques can be categorized as supervised, semi-supervised, or unsupervised according to the class information available for the input samples. Compared with unsupervised techniques, supervised techniques are better suited to classification tasks, because the label information guides the dimensionality reduction procedure (Du et al. 2007). A key characteristic of an automated plant identification system is the suitable selection of leaf features. Various techniques, including CNN-oriented methods, are available for leaf identification, and the underlying idea of such methods plays a significant role in the identification process (Kadir et al. 2013; Mallah et al. 2013). However, training a CNN requires a large quantity of manually labelled data, so it is difficult to apply in settings where only a small dataset is available.

Multi-feature fusion techniques are also used to describe leaves (Ye and Weng 2011). Various works train classifiers using pattern matching or machine learning methods; representative algorithms include DP, KNN, and SVM. More recently, deep learning methods have been used to represent and classify leaf images. Yet deep learning has drawbacks (Hu et al. 2018). High classification accuracy depends on the network having an adequate number of supervised learning samples, which are often hard to obtain. In most cases only a limited number of learning samples is available, and general deep neural networks then return very poor results (Wu et al. 2014). Thus, novel methods are needed that can learn from a small number of supervised samples (Singh et al. 2019). Human beings can classify objects precisely and quickly from only a few samples, and when new samples are introduced they can reach precise judgments through measurement, that is, by comparing the new samples with known ones (Longlong et al. 2015). It is therefore suggested to design a measurement technique, learned through network training, that enables learning from few samples and can be applied to automatic leaf classification.

A pixel that belongs to a region of an untrained class can only be assigned the label of one of the trained classes. The effect of an untrained class on the accuracy of plant leaf classification depends on the threshold computation. Here, the impact of untrained classes on plant leaf classification is assessed using algorithms that generate absolute and relative threshold measures (Seeland et al. 2019). The presence of an untrained class can decrease the accuracy of plant leaf classification. Hence, even when the set of classes defined in the training phase is intended to be comprehensive, classifying plant leaves from untrained data remains challenging. To improve classification, the database has been enlarged by applying sample augmentation to the images (Liu 2018). The difficulty of classifying leaf images with low inter-class variability has been addressed by an automatic discriminative method based on CNN (Tavakoli et al. 2021). Deep learning (Srinivas and Manivannan 2020) is a popular machine learning approach for plant leaf classification. Leaf shape is highly significant, as it helps to identify plant species and to assess plant health (Bhambere 2011).

The main contributions related to this paper are described below.

  • A new deep learning-based plant leaf classification model is introduced that classifies untrained images by observing both the classification score and the classification label, thereby enhancing the classification rate for trained as well as untrained data.

  • A new hybrid optimization algorithm, SS-WOA, is introduced. It offers higher efficiency and a higher probability of finding the global optimum, lower computational time, and fast convergence, and it can also solve problems that require finding accurate mathematical models, which makes it suitable for enhancing the CNN to handle untrained data in plant leaf classification.

  • The proposed SS-WOA is validated against different machine learning and optimization algorithms to determine its superiority in achieving high classification accuracy for the trained as well as the untrained data.

The remainder of the paper is organized as follows. Section 1 introduces plant leaf classification with attention to the challenge of untrained data. Section 2 reviews the literature on plant leaf classification. The proposed plant leaf classification model for challenging untrained data is explained in Sect. 3. The pre-processing and the objective model for plant leaf classification are described in Sect. 4. Section 5 describes the improved algorithm for the optimized threshold-based CNN used to classify plant leaves. The results and discussion are presented in Sect. 6. Finally, Sect. 7 concludes the paper.

2 Literature survey

Bin and Wang (2019) addressed a few-shot learning technique based on the Siamese network framework to handle leaf classification with a small sample size. First, features were extracted from two different images using a parallel two-way CNN with weight sharing. Next, the network used a loss function to learn a metric space in which similar leaf samples lie close to each other and dissimilar samples lie far apart. Additionally, an SSO technique was developed to build the metric space, which in turn improved the leaf classification accuracy. Finally, the leaves in the learned metric space were classified by a kNN classifier. Average classification accuracy was used as the performance measure, and the technique was evaluated on the “Leafsnap, Swedish, and Flavia datasets”. The experimental outcomes revealed that, even with a small supervised sample size, the developed technique achieved high classification accuracy.

Mostajer and Asghari (2019) examined a new technique for plant species recognition using GIST texture features. The PCA algorithm then selected the principal features. The extracted features were classified with three techniques: KNN, SVM, and Patternnet NN. The developed algorithm was applied to three popular datasets, and the outcomes exceeded those of various techniques with respect to accuracy and time. The best outcomes were attained by the Cosine KNN classifier with PCA applied to the GIST feature vector.

Rhouma et al. (2017) developed seven novel invariants for different shapes and tested them on the leaf classification problem. One of the novel invariants was an area-oriented counterpart of an existing boundary-oriented measure of anisotropy. The remaining six invariants were entirely new and were based on the geometric distribution of the first two Hu moment invariants. All the proposed invariants could be computed from the geometry of the shape parts, which led to a simpler computation of scaling, rotation, and translation invariants. The novel invariants were robust to noise and to mild deformations. Various desirable properties were verified experimentally on a large number of synthetic examples. The application of the novel shape invariants was demonstrated on a well-known leaf dataset.

Parekh et al. (2018) presented another approach to plant species classification using digital leaf images. Plant leaves exhibit a collection of distinctive characteristics, such as unique vein-patterned textures, compound or simple shape, and green or non-green color. A single set of features may therefore not be appropriate for classifying heterogeneous plant types. A hierarchical architectural model was developed that combines various components to achieve powerful visual data classification. The study combined the classifiers and the feature extraction modules, which resulted in superior performance. The database was partitioned into distinct components using visual discriminators for improved efficiency. New layers could be added to the system, which provided adaptability. The shape features of several leaf types were obtained with the help of FSST. The experiment was conducted on two publicly available databases involving “non-green, green, and compound and simple leaves” with variations in “design, size, and shape”, which demonstrated the superiority of the developed technique over various existing procedures.

Chen et al. (2019) addressed a unified multi-scale technique for leaf image retrieval and classification that captures the geometric information of leaves. The technique uses an efficient three-step strategy to locate the respective neighbour points for every point on the leaf contour. The resulting descriptor offers a fine depiction of leaf contours and comprises naturally distinctive characteristics. Because it has no scale parameters, no optimisation procedure was needed. The technique was applied to three well-known contour features, namely the triangle-area, arch-height, and angle representations, to capture the geometric information present in the leaves. FFT was applied to the features of the unified multi-scale method for quick and suitable leaf matching. Image retrieval and classification experiments were carried out on four datasets using three standard performance evaluation measures, and the unified multi-scale technique showed better results.

Qureshi et al. (Saleem et al. 2019) evaluated various handcrafted visual leaf features, along with their extraction processes and classification techniques. A novel algorithm was proposed for recognising the plant type from leaf images; it consists of “image pre-processing, segmentation, feature extraction, dimensionality reduction, and classification steps”. The developed algorithm was examined on the ‘Flavia’ dataset and on a synthetic dataset, and it was tested with distinct classifiers such as multi-SVM, NB, DT, and KNN. With the ‘Flavia’ dataset, KNN achieved recall and precision values of 98.8% and 97.6%, respectively. With the synthetic dataset, the recall and precision were 97.3% and 96.1%, respectively. The technique was reported to be a precise plant type recognition approach for real-time situations. The classification was also compared with AlexNet, a CNN-oriented technique, and it was confirmed that the handcrafted feature-oriented technique exceeded AlexNet in robustness when a small training dataset was used.

Zhou et al. (2017) developed an automatic classification method for leaf images of medicinal plants that addressed the shortcomings of manual classification in recognizing medicinal plants. First, the leaf images of the medicinal plants were pre-processed. Next, five texture features and ten shape features were computed. Finally, the leaves of the medicinal plants were classified by the SVM. The method was applied to 12 different leaf images, and a better recognition rate was achieved. The results demonstrated that medicinal plants can be categorized automatically using multi-feature extraction and SVM. The method offers a sound framework for the research and development of medicinal plant classification systems.

Yingke et al. (2015) proposed a novel weight measure and, based on it, developed a dimensionality reduction algorithm known as SSODP. SSODP uses both unlabelled and labelled data to build the weight by combining “the class information, the local neighborhood structure, and the reliability information of the data”. SSODP was more efficient with respect to the plant leaf classification rate.

Guoqing Xu et al. (2020) developed a multi-granular angle feature descriptor based on quotient space for plant classification and retrieval tasks. The descriptor extracts angle features from the contour points of the leaf at different granularities, so it can capture both global and local information of the leaf contour. The similarity between a pair of leaves is estimated using the multi-granular angle feature. The retrieval performance of this method is very promising; however, the optimal parameter of MCC is difficult to determine.

Mohammad Keivani et al. (Mohammad Keivani, Jalil Mazloum, Ezatollah Sedaghatfar 2020) proposed a feature reduction method called PBPSO. The research targets plant identification in agriculture by means of image processing, a field that has received less attention than other application domains. The method is defined as a plant identification system in which specimens are identified and categorized quickly by image processing. It performs better only when the number of features reaches 300.

Amgad et al. (Fati 2020) developed an efficient automatic classification system for recognizing Malaysian herbs, which is useful in medical and culinary applications. The system uses two classifiers, SVM and DLNN. Both algorithms were tested on the same dataset, and the DLNN algorithm proved more appropriate. The mobile app was not fully developed: it works well on the Windows platform, but packaging for the Android mobile app was unsuccessful. The method can detect herb leaves even when they are wet, dried, or deformed (Table 1).

Table 1 Features and challenges of state-of-the-art leaf classification models

3 Proposed plant leaf classification model for challenging untrained data

3.1 Proposed architectural model

Plant leaf classification is an approach in which a leaf is classified on the basis of its morphological features. It is a complex task relevant to botany and to the cotton, tea, and various other industries. In general, extracting leaf features such as color, texture, and shape is essential for classifying leaf images. Based on the extracted features, pattern matching or machine learning is used to classify the leaves. These techniques rely on manual feature extraction for the leaf representation, followed by one of several machine learning classifiers. More recently, deep learning techniques such as DBN and CNN have produced better results in classifying leaves. However, a major challenge arises when classifying untrained images. Untrained classification differs from trained classification in that it needs only the input data. Most untrained classification takes the form of cluster analysis, in which the data are grouped so that the items in each cluster are more similar to each other than to the items in the remaining clusters. There are no predefined target outputs in untrained classification: the data are fed to a machine learning algorithm that characterizes what is normal for a specific group of data, and the outputs are not chosen manually. In the proposed plant leaf classification model, these challenges are addressed by observing the classification score rather than only the classification label. The proposed architecture of leaf classification for untrained data is displayed in Fig. 1.

Fig. 1 Proposed architecture of leaf classification for untrained data

The proposed method of plant leaf classification for untrained data consists of three main phases: “Data acquisition, pre-processing, and classification”. In the initial phase, the plant leaf data are collected from the Swedish leaf dataset and the Mendeley dataset, which are standard benchmark datasets. The Swedish leaf dataset is used for training, and the Mendeley dataset is used for testing; the latter is referred to here as the untrained data. Once the data are collected, the pre-processing phase begins. It improves the image quality so that the image is suitable for subsequent processing. Here, pre-processing is accomplished by three techniques: RGB to gray conversion, histogram equalization, and median filtering. In RGB to gray conversion, the RGB values of every pixel are taken as the input, and the output is a single value that reflects the brightness of that pixel. Histogram equalization adjusts the contrast of an image by altering the intensity distribution of its histogram, thereby giving a linear trend to the cumulative probability function associated with the image. Median filtering is a non-linear digital filtering method used to remove the noise present in the image. The pre-processed image is then passed to the final classification phase. The classification is performed by the deep learning model called CNN, which is improved by optimizing the activation function and the hidden neurons with the proposed SS-WOA. Since classifying untrained images is a challenging task, an optimized threshold-based CNN is introduced for the leaf classification of the untrained data. Here, the threshold is optimized by the same proposed SS-WOA to attain maximum classification accuracy for the untrained data.
Rather than the classification label alone, the major intention is to observe the classification score. An untrained image is classified not only according to the class with the highest classification score but also according to whether that score exceeds the threshold value. If the classification score is greater than the threshold, the corresponding leaf type is predicted as the output; otherwise, the output is predicted as an unknown leaf type. Consider the image database as \(DA = \left\{ {Y_{{mq}}^{{in}} } \right\}\), where \(mq = 1,2, \cdots ,MQ\) and \(MQ\) denotes the total number of images in the database.
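The score-and-threshold decision rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `classify_with_threshold`, the list of class names, and the example threshold are assumptions, and the scores are assumed to be one value per trained class (for example, softmax outputs of the CNN).

```python
import numpy as np

def classify_with_threshold(scores, class_names, threshold):
    """Return the class with the highest score, or 'unknown' when that
    score does not exceed the threshold (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    best = int(np.argmax(scores))          # index of the highest classification score
    if scores[best] > threshold:           # score must be greater than the threshold
        return class_names[best]
    return "unknown"                       # otherwise report an unknown leaf type

# Illustrative class names and threshold (assumed values)
names = ["species_a", "species_b", "species_c"]
print(classify_with_threshold([0.10, 0.80, 0.10], names, threshold=0.5))
print(classify_with_threshold([0.40, 0.35, 0.25], names, threshold=0.5))
```

In the paper the threshold is a quantity tuned by SS-WOA; here it is simply passed in as a fixed number.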

3.2 Optimized CNN model for untrained data

The deep learning model called CNN is used to classify the plant leaves for the untrained data. The major advantage of the CNN is that it detects significant features automatically, without human support. It reduces the computational complexity without losing the essence of the data. It is very effective in classifying images because the same knowledge is shared across various locations of the image, and it is fast to train. Moreover, CNNs are powerful tools for extracting general-purpose features that can work well for unknown classes. The activation function introduces nonlinearities into the CNN, which allows multi-layer networks to detect nonlinear features. The CNN learns spatial hierarchies of features adaptively and automatically via back propagation, using multiple building blocks such as convolution layers, pooling layers, and fully connected layers. Since classifying untrained images is a challenging task here, an optimized threshold-based CNN is developed. Instead of only the classification label, it observes the classification score. The proposed SS-WOA optimizes the threshold value to achieve the maximum classification accuracy for the untrained data. An untrained image is classified on the basis of the highest classification score and of whether that score exceeds the threshold value. CNNs (Rawat and Wang 2017) are feed-forward networks in which information flows in one direction. Like the ANN, CNNs are biologically inspired. The CNN architecture comes in different variations. In general, CNNs are composed of convolutional and subsampling (pooling) layers that are grouped into modules, which are followed by one or more fully connected layers. Stacking the modules on top of each other yields a deep model. Figure 2 depicts the typical CNN architecture for a leaf classification task.
An image is given directly as input to the network, followed by several stages of convolution and pooling. The representations from these operations are then fed to one or more fully connected layers. Finally, the last fully connected layer outputs the class label. Recently, numerous architecture variations have been developed with the aim of reducing computation costs or enhancing image classification accuracy.

Fig. 2 Optimized CNN model for the untrained data

3.2.1 Convolutional layers

The convolutional layers act as feature extractors: they learn the feature representations of their input images. The neurons in a convolutional layer are arranged into feature maps. Each neuron in a feature map has a receptive field, which is connected to a neighbourhood of neurons in the previous layer through a set of trainable weights called a filter bank. A new feature map is computed by convolving the inputs with the learned weights and passing the convolved results through a nonlinear activation function. All neurons within a feature map share the same weights, whereas different feature maps within the same convolutional layer have distinct weights, so that several features can be extracted at every location. In general, the \(kf^{{th}}\) output feature map \(Yq_{{kf}}\) is computed as in Eq. (1).

$$Yq_{{kf}} = f\left( {Wb_{{kf}} *Y_{{mq}}^{{med}} } \right)$$
(1)

Here, the nonlinear activation function is denoted by \(f\left( \cdot \right)\), the 2D convolution operator is denoted by the asterisk \(*\), \(Wb_{{kf}}\) describes the convolutional filter linked to the \(kf^{{th}}\) feature map, and \(Y_{{mq}}^{{med}}\) describes the pre-processed input image.
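As a rough illustration of Eq. (1), the sketch below computes one feature map by a "valid" 2D convolution of a pre-processed image with a single filter, followed by an activation function. The function name `feature_map`, the choice of ReLU as the default activation, and the absence of padding and stride are assumptions made for this example; in the proposed model the activation function is one of the quantities selected by SS-WOA.

```python
import numpy as np

def relu(z):
    """Rectified linear unit, used here only as an example activation."""
    return np.maximum(z, 0.0)

def feature_map(image, kernel, activation=relu):
    """Eq. (1) sketch: activation(kernel * image) with a 'valid' 2D convolution."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]            # true convolution flips the kernel
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # weighted sum over the receptive field at (r, c)
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * flipped)
    return activation(out)

# Small example: a 4 x 4 "image" and a 2 x 2 summing filter
img = np.arange(16).reshape(4, 4)
print(feature_map(img, np.ones((2, 2))))
```

The explicit loops keep the correspondence with the receptive-field description visible; a practical implementation would rely on an optimized library routine instead.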

3.2.2 Pooling layers

The pooling layers reduce the spatial resolution of the feature maps, which provides spatial invariance to translations and input distortions. Average pooling layers propagate the average of the input values in each pooling window to the next layer. Every output map combines multiple input maps through convolution, as in Eq. (2).

$$Y_{{jf}}^{{CL}} = f\left( {\sum\nolimits_{{iq \in MI_{{jf}} }} {Y_{{iq}}^{{CL - 1}} } *KM_{{iqjf}}^{{CL}} + ab_{{jf}}^{{CL}} } \right)$$
(2)

In the above equation, \(MI_{{jf}}\) describes a selection of input maps, \(CL\) denotes the convolutional layer with \(iq^{{th}}\) input map and \(jf^{{th}}\) output map, \(ab_{{jf}}^{{CL}}\) describes the additive bias of the \(CL\) convolutional layer, \(KM_{{iqjf}}^{{CL}}\) describes the kernel map of the \(CL\) convolutional layer that links input map \(iq\) to output map \(jf\), \(CL - 1\) denotes the preceding (downsampling) layer, and \(Y_{{iq}}^{{CL - 1}}\) describes the input feature maps supplied by layer \(CL - 1\).
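The averaging behaviour of the pooling layers mentioned in this subsection can be sketched as below. The function name `average_pool`, the non-overlapping square window, and the window size of 2 are assumptions for illustration; rows and columns that do not fill a complete window are simply dropped.

```python
import numpy as np

def average_pool(fmap, size=2):
    """Non-overlapping average pooling over size x size windows (assumed layout)."""
    fmap = np.asarray(fmap, dtype=float)
    h = (fmap.shape[0] // size) * size      # crop to a multiple of the window size
    w = (fmap.shape[1] // size) * size
    # Split the map into (rows of blocks, size, columns of blocks, size)
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.mean(axis=(1, 3))         # average inside each window

fmap = np.arange(16).reshape(4, 4)
print(average_pool(fmap))                   # 4 x 4 map reduced to 2 x 2
```

Each output value summarizes one window, so the spatial resolution drops by the window size in each direction.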

3.2.3 Fully connected layers

Stacking several convolutional and pooling layers extracts increasingly abstract feature representations. The fully connected layers interpret these feature representations and perform the high-level reasoning function.

3.2.4 Training

The free parameters are adjusted using learning algorithms to obtain the desired network output. Back propagation is the most common algorithm used for this purpose. To improve the existing CNN architecture, the activation function, hidden neurons, and threshold of the CNN are optimized by the proposed SS-WOA to attain maximum accuracy.

4 Pre-processing and objective model for plant leaf classification

4.1 Dataset description

Two datasets are gathered for the leaf classification of untrained data.

Dataset 1: The first dataset is the Swedish Leaf Dataset, which is used for training. It is collected from the link “http://www.cvl.isy.liu.se/en/research/datasets/swedish-leaf/”. It is composed of 15 leaf species with 75 images per species. This dataset is commonly employed to evaluate shape matching methods. The leaves show clear characteristics and are aligned manually with small rotation. Sample images of leaves from the 15 tree classes of dataset 1 are displayed in Table 2.

Table 2 Sample leaf images from Dataset 1

Dataset 2: The second dataset is the Mendeley Data, which is used for testing. It is gathered from the link “https://data.mendeley.com/datasets/hb74ynkjcn/1”. Here, twelve environmentally and economically beneficial plants are considered for the testing process. Sample images from dataset 2 are listed in Table 3.

Table 3 Sample leaf images from Dataset 2

4.2 Pre-processing

Pre-processing operates at the lowest level of abstraction and enhances the image data by suppressing undesired distortions. It also improves image features that are necessary for further processing. The pre-processed image is largely free of such impurities, which makes it better suited to the subsequent classification phase. Here, the pre-processing is performed using RGB to gray conversion, histogram equalization, and median filtering.

4.2.1 RGB to gray conversion

A gray scale image contains only shades of gray, ranging from black to white. The value of every pixel describes the intensity of the light. The RGB to gray conversion is done by averaging the three color channels, that is, by summing the R, G, and B values and dividing the sum by 3. The final gray scale image is represented by \(Y_{{mq}}^{{gray}}\).
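The channel-averaging conversion described above can be written in a few lines. The sketch assumes the image is stored as a NumPy array of shape height x width x 3 with the channels in R, G, B order; the function name `rgb_to_gray` is illustrative.

```python
import numpy as np

def rgb_to_gray(rgb):
    """Average the R, G, and B channels: gray = (R + G + B) / 3."""
    rgb = np.asarray(rgb, dtype=float)
    return rgb[..., :3].sum(axis=-1) / 3.0

# One pixel with R = 30, G = 60, B = 90
pixel = np.array([[[30, 60, 90]]])
print(rgb_to_gray(pixel))
```

Note that this plain average weights the three channels equally, as stated in the text, rather than using perceptual luminance weights.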

4.2.2 Histogram equalization

Histogram-oriented methods for image enhancement depend on equalizing the image histogram and enhancing the dynamic range of the image. HE (Yeganeh et al. 2008) is widely used for contrast enhancement owing to its effectiveness and simplicity. HE distributes the pixel values uniformly and produces an enhanced image with a linear cumulative histogram. Applications of histogram modification include texture synthesis, speech recognition, and medical image processing. Consider a digital image whose gray levels lie in the range \(\left[ {0,LR - 1} \right]\). Equation (3) calculates the probability distribution function of the image.

$$Pd\left( {rg_{{ka}} } \right) = \frac{{np_{{ka}} }}{{NP}}\quad ka = 0, \cdots ,LR - 1$$
(3)

Here, \(rg_{{ka}}\) represents the \(ka^{{th}}\) gray level, \(np_{{ka}}\) denotes the number of pixels in the image with gray level \(rg_{{ka}}\), and \(NP\) denotes the total number of pixels in the image. Equation (4) describes the cumulative distribution function (CDF).

$$\begin{gathered} CDF\left( {rg_{{ka}} } \right) = \sum\limits_{{ia = 0}}^{{ia = ka}} {Pd\left( {rg_{{ia}} } \right)} \hfill \\ ka = 0, \cdots ,LR - 1,\quad 0 \le CDF\left( {rg_{{ka}} } \right) \le 1 \hfill \\ \end{gathered}$$
(4)

Using the CDF in Eq. (4), the output gray level \(SG_{{ka}}\) is assigned to the gray level \(rg_{{ka}}\) of the input image, as formulated in Eq. (5).

$$SG_{{ka}} = \left( {LR - 1} \right) \times CDF\left( {rg_{{ka}} } \right)$$
(5)

In the usual histogram equalization technique, the increment of the gray level \(SG_{{ka}}\) is calculated as in Eq. (6).

$$\Delta SG_{{ka}} = \left( {LR - 1} \right) \times Pd\left( {rg_{{ka}} } \right)$$
(6)

In the above equation, the distance between consecutive output gray levels, such as \(SG_{{ka}}\) and \(SG_{{ka + 1}}\), is directly proportional to the probability distribution function of the input image at the corresponding gray level. The histogram equalized image is represented as \(Y_{{mq}}^{{his}}\).
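A compact sketch of Eqs. (3) to (5) is given below. It assumes an integer-valued gray image with levels in \(\left[ {0,LR - 1} \right]\); the function name `equalize_histogram` is hypothetical, and rounding the mapped levels to the nearest integer is an added assumption, since Eq. (5) itself yields real values.

```python
import numpy as np

def equalize_histogram(gray, levels=256):
    """Histogram equalization following Eqs. (3)-(5); `levels` plays the role of LR."""
    gray = np.asarray(gray, dtype=np.int64)
    counts = np.bincount(gray.ravel(), minlength=levels)   # np_ka for each gray level
    pdf = counts / gray.size                                # Eq. (3): np_ka / NP
    cdf = np.cumsum(pdf)                                    # Eq. (4): running sum of Pd
    mapping = np.round((levels - 1) * cdf).astype(np.int64) # Eq. (5), rounded (assumption)
    return mapping[gray]                                    # look up SG_ka for every pixel

# Tiny example with LR = 4 gray levels
img = np.array([[0, 1], [2, 3]])
print(equalize_histogram(img, levels=4))
```

The lookup in the last line applies the same level-to-level mapping to every pixel, which is what makes the cumulative histogram of the output approximately linear.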

4.2.3 Median filtering

The median filter (Failed 2011) is a nonlinear signal processing technique based on statistics. The pixels inside the filter mask are ranked according to their gray levels, and the noisy value is replaced by the median of the mask. The output is represented by \(Y_{{mq}}^{{med}} \left( {xz,yz} \right) = med\left\{ {Y_{{mq}}^{{his}} \left( {xz - iz,yz - jz} \right)\,,\quad iz,jz \in DM} \right\}\), in which \(Y_{{mq}}^{{med}} \left( {xz,yz} \right)\) and \(Y_{{mq}}^{{his}} \left( {xz,yz} \right)\) represent the output image and the original (histogram equalized) image, respectively; \(DM\) denotes a two-dimensional mask of size \(sn \times sn\), in which \(sn\) is usually odd (for example, 7 × 7 or 9 × 9); and the shape of the mask may be a cross, circle, square, line, etc.
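The median filtering step can be sketched as follows for a square mask. The function name `median_filter`, the default mask size of 3, and the replication of edge pixels at the image border are assumptions chosen for this example; the text does not specify how borders are handled.

```python
import numpy as np

def median_filter(image, size=3):
    """Replace each pixel by the median of the size x size mask centred on it."""
    image = np.asarray(image, dtype=float)
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")   # border handling: repeat edge pixels (assumption)
    out = np.empty_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            window = padded[r:r + size, c:c + size]
            out[r, c] = np.median(window)      # median of the ranked gray levels
    return out

# A flat image with one impulse-noise pixel in the centre
noisy = np.full((5, 5), 10.0)
noisy[2, 2] = 255.0
print(median_filter(noisy))
```

Because the impulse is a single outlier inside every window that contains it, the median ignores it and the flat background is restored.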

4.3 Objective model

The main aim of the developed leaf classification for the untrained data is to maximize the accuracy. Here, the activation function, the hidden neurons, and the threshold of the CNN are optimized by the proposed SS-WOA. Rather than the classification label alone, the classification score is mainly considered here. The final classification of the untrained data depends on the classification score and the optimized threshold value. The objective function of the proposed SS-WOA-based leaf classification for the untrained data is described in Eq. (7).

$$Ob\;fun = \mathop {\arg \max }\limits_{{\left\{ {AF,HN,TH} \right\}}} \left( {Acy} \right)$$
(7)

Here, in Eq. (7), \(AF\) denotes the activation function, \(HN\) denotes the hidden neurons, and \(TH\) denotes the threshold of the CNN that are to be optimized by the presented SS-WOA. The accuracy is shown in Eq. (8).

$$Acy = \frac{{trp + trn}}{{trp + trn + fap + fan}}$$
(8)

Here, \(Acy\) denotes the accuracy, and \(trp\), \(trn\), \(fap\), and \(fan\) represent the true positives, true negatives, false positives, and false negatives, respectively.
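Equations (7) and (8) can be illustrated with a small sketch. The `accuracy` function follows Eq. (8) directly. The `best_setting` helper is only a stand-in for the optimizer: it picks the best candidate by exhaustive comparison, which is not SS-WOA, and the candidate tuples (activation function, hidden neurons, threshold) and their accuracy values are made-up placeholders.

```python
def accuracy(trp, trn, fap, fan):
    """Eq. (8): fraction of correct decisions among all decisions."""
    total = trp + trn + fap + fan
    if total == 0:
        raise ValueError("at least one sample is required")
    return (trp + trn) / total

def best_setting(candidates, evaluate):
    """Eq. (7) by brute force: return the candidate with the highest accuracy.
    This exhaustive search replaces SS-WOA purely for illustration."""
    return max(candidates, key=evaluate)

# Placeholder results for two hypothetical (AF, HN, TH) settings
results = {("relu", 32, 0.5): accuracy(35, 45, 10, 10),
           ("tanh", 64, 0.7): accuracy(40, 50, 5, 5)}
print(best_setting(list(results), results.get))
```

In practice the `evaluate` callable would train and test the CNN for a given setting, which is far more expensive than the dictionary lookup used here.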

5 Improved algorithm for optimized threshold-based CNN applicable for plant leaf classification

5.1 Proposed SS-WOA

The proposed SS-WOA is used for performing the plant leaf classification for the untrained data. An optimization algorithm is used to find the unconstrained minima, maxima, or optimum solution of differentiable and continuous functions. CNNs are broadly used in image classification tasks. A CNN has numerous parameters, such as the neuron count, layer count, filter size, and input window size, which determine the classification accuracy achieved on a given task. Here, the proposed SS-WOA optimizes the activation function, hidden neurons, and the threshold of the CNN for classifying the plant leaves of the untrained data. WOA (Mirjalili and Lewis 2016) mimics the social behaviour of humpback whales and is inspired by their bubble-net hunting method.

5.1.1 Encircling the prey

The position of the optimum in the search space is not known in advance, so WOA assumes that the current best candidate solution is the target prey or is close to the optimum. Once the best search agent is identified, the remaining search agents update their positions towards it. This behaviour is modelled mathematically in Eqs. (9) and (10).

$$D\vec{Q} = \left| {C\vec{V} \cdot X\vec{a}*\left( {kp} \right) - X\vec{a}\left( {kp} \right)} \right|$$
(9)
$$X\vec{a}\left( {kp + 1} \right) = X\vec{a}*\left( {kp} \right) - A\vec{V} \cdot D\vec{Q}$$
(10)

Here, ‘\(\cdot\)’ denotes element-by-element multiplication, | | denotes the absolute value, \(X\vec{a}\) is the position vector, \(X\vec{a}*\) is the position vector of the best solution obtained so far, \(C\vec{V}\) and \(A\vec{V}\) are the coefficient vectors, and \(kp\) is the current iteration. Equations (11) and (12) describe the computation of the vectors \(A\vec{V}\) and \(C\vec{V}\).

$$A\vec{V} = 2a\vec{r} \cdot RN\vec{1} - a\vec{r}$$
(11)
$$C\vec{V} = 2 \cdot RN\vec{1}$$
(12)

In the above equations, \(RN\vec{1}\) is a random vector in [0, 1], and \(a\vec{r}\) is decreased linearly from 2 to 0 over the iterations in both the exploration and the exploitation phases.
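The encircling update can be sketched as follows. This is an illustration rather than the authors' implementation: it uses the coefficient form \(A = 2a \cdot r - a\) from the original WOA of Mirjalili and Lewis (2016), draws one random number per dimension, and the function names `encircle_step` and `linear_a` are hypothetical.

```python
import random


def linear_a(kp, kp_max):
    """Control parameter a, decreased linearly from 2 to 0 over the iterations."""
    return 2.0 * (1.0 - kp / kp_max)


def encircle_step(position, best, a, rng=random):
    """One encircling-prey update of a single search agent, Eqs. (9)-(12)."""
    new_position = []
    for x, x_best in zip(position, best):
        r = rng.random()            # random number in [0, 1)
        A = 2 * a * r - a           # coefficient A, lies in [-a, a]
        C = 2 * r                   # coefficient C, lies in [0, 2)
        D = abs(C * x_best - x)     # Eq. (9): distance to the best agent
        new_position.append(x_best - A * D)  # Eq. (10)
    return new_position


# With a = 0 the coefficient A vanishes and the agent lands on the best agent.
moved = encircle_step([1.0, 2.0], [3.0, 4.0], a=linear_a(25, 25))
```

As \(a\) shrinks, the interval of \(A\) shrinks with it, so later iterations place each agent closer to the best agent.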

5.1.2 Bubble-net attacking technique (exploitation phase)

Two techniques are used to describe the bubble-net characteristics of the humpback whales.

5.1.3 Shrinking encircling mechanism

This behaviour is obtained by decreasing \(a\vec{r}\) in Eq. (11), which also narrows the range of \(A\vec{V}\). With random values of \(A\vec{V}\) in the interval [-1,1], the new position of a search agent lies anywhere between its original position and the position of the current best agent.

5.2 Spiral updating position

This technique first computes the distance between the whale positioned at \(\left( {Xa,Ya} \right)\) and the prey positioned at \(\left( {Xa*,Ya*} \right)\). The helix-shaped movement of the humpback whales is then mimicked by a spiral equation between the position of the whale and that of the prey, as in Eq. (13).

$$X\vec{a}\left( {kp + 1} \right) = D\vec{Q^{\prime}} \cdot e^{{bcRN2}} \cdot \cos \left( {2\pi RN2} \right) + X\vec{a}*\left( {kp} \right)$$
(13)

Here, ‘\(\cdot\)’ denotes element-by-element multiplication, \(RN2\) is a random number, \(bc\) is a constant, and \(D\vec{Q^{\prime}} = \left| {X\vec{a}*\left( {kp} \right) - X\vec{a}\left( {kp} \right)} \right|\) is the distance of the \(is^{{th}}\) whale to the prey, that is, the best solution obtained so far.

The humpback whales swim around the prey within a shrinking circle and along a spiral path at the same time. This simultaneous behaviour is modelled by assuming a probability of 0.5 for choosing between the shrinking encircling mechanism and the spiral model when updating the whale's position during optimization. The mathematical model is given in Eq. (14).

$$X\vec{a}\left( {kp + 1} \right) = \left\{ {\begin{array}{*{20}c} {X\vec{a}*\left( {kp} \right) - A\vec{V} \cdot D\vec{Q}} & {if\;RN3 < 0.5} \\ {D\vec{Q^{\prime}} \cdot e^{{bcRN2}} \cdot \cos \left( {2\pi RN2} \right) + X\vec{a}*\left( {kp} \right)} & {if\;RN3 \ge 0.5} \\ \end{array} } \right.$$
(14)

In the above equation, a random number is denoted by \(RN3\).
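The choice between the two bubble-net branches in Eq. (14) can be sketched as below. Several details are assumptions and not taken from the text: \(RN2\) is drawn from [-1, 1] as in the original WOA, \(bc = 1\), the coefficient \(A\) is computed as \(2a \cdot r - a\), and all function names are hypothetical.

```python
import math
import random


def shrink_step(position, best, a, rng=random):
    """Shrinking encircling branch of Eq. (14), used when RN3 < 0.5."""
    new_position = []
    for x, x_best in zip(position, best):
        r = rng.random()
        A = 2 * a * r - a
        C = 2 * r
        new_position.append(x_best - A * abs(C * x_best - x))
    return new_position


def spiral_step(position, best, bc=1.0, rng=random):
    """Spiral branch of Eq. (14), used when RN3 >= 0.5."""
    rn2 = rng.uniform(-1.0, 1.0)  # range assumed from the original WOA
    factor = math.exp(bc * rn2) * math.cos(2 * math.pi * rn2)
    # D' = |X* - X| is the distance of the whale to the prey.
    return [abs(x_best - x) * factor + x_best for x, x_best in zip(position, best)]


def bubble_net_step(position, best, a, bc=1.0, rng=random):
    """Pick one of the two branches with equal probability."""
    rn3 = rng.random()
    if rn3 < 0.5:
        return shrink_step(position, best, a, rng)
    return spiral_step(position, best, bc, rng)


# An agent that already sits on the prey stays there on the spiral branch.
same = spiral_step([1.0, 2.0], [1.0, 2.0])
```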

5.2.1 Prey search (exploration phase)

By varying the \(A\vec{V}\) vector, the same technique can be employed for prey searching (exploration). In the exploration phase, the position of a search agent is updated with respect to a randomly selected search agent rather than the best search agent. This mechanism, together with \(\left| {A\vec{V}} \right| > 1\), emphasizes exploration and allows the WOA algorithm to perform a global search, as shown in Eqs. (15) and (16).

$$D\vec{Q} = \left| {C\vec{V} \cdot X\vec{a}_{{rnd}} - X\vec{a}} \right|$$
(15)
$$X\vec{a}\left( {kp + 1} \right) = X\vec{a}_{{rnd}} - A\vec{V} \cdot D\vec{Q}$$
(16)

Here, \(X\vec{a}_{{rnd}}\) is a random position vector. WOA has several advantages, such as a small number of parameters and the ability to handle optimization problems effectively. Despite these advantages, it suffers from limitations such as slow convergence and poor exploration of the search space. To overcome these limitations, SSO is combined with it, and the resulting hybrid meta-heuristic algorithm is called SS-WOA. SSO has advantages such as better convergence behaviour and the ability to handle real-world engineering problems. SSO (Failed 2014) is based on the smelling behaviour of the shark, since the sense of smell is one of the shark's most effective senses. The odour concentration is the major factor that guides the movement of the shark towards its prey. This behaviour is used to search for the solution of the optimization problem.

In the proposed SS-WOA, if the random number satisfies \(\left( {RN3 < 0.5} \right)\), the algorithm checks whether \(\left( {\left| {AV} \right| < 1} \right)\). If this condition is satisfied, the forward movement of SSO takes place using Eq. (17).

$$Ya_{{is}}^{{kp + 1}} = Xa_{{is}}^{{kp}} + VV_{{is}}^{{kp}} \cdot \Delta ti_{{kp}} \quad is = 1, \cdots ,PS\quad kp = 1, \cdots kp_{{\max }}$$
(17)

In the above equation, the time interval of the stage \(kp\) is represented by \(\Delta ti_{{kp}}\), which is set to \(\Delta ti_{{kp}} = 1\) for all stages. Here, the terms \(Xa_{{is}}^{{kp}}\) and \(VV_{{is}}^{{kp}}\) are described as in Eqs. (18) and (19).

$$Xa_{{is}}^{1} = \left[ {xa_{{is,1}}^{1} ,xa_{{is,2}}^{1} , \cdots ,xa_{{is,DV}}^{1} } \right],\,\;is = 1, \cdots ,PS$$
(18)
$$VV_{{is}}^{{kp}} = \eta _{{kp}} \cdot RN1 \cdot \nabla \left( {OB} \right)|xa_{{is}}^{{kp}} ,\;is = 1, \cdots ,PS,\;kp = 1, \cdots ,kp_{{\max }}$$
(19)

In the above equations, \(DV\) denotes the number of decision variables of the optimization problem, and \(xa_{{is,js}}^{1}\) denotes the \(js{\text{th}}\) decision variable of the \(is\)th individual \(Xa_{{is}}^{1}\), that is, the \(js{\text{th}}\) dimension of the \(is\)th shark position. Here, the gradient is denoted by \(\nabla \left( {OB} \right)\), and the objective function is denoted by \(OB\). The term \(VV_{{is}}^{{kp}}\) represents the velocity of the shark in every stage, and \(kp_{{\max }}\) represents the number of stages. The stage number is denoted by the superscript \(kp\), and \(RN1\) is a random number. If \(\left( {\left| {AV} \right| \ge 1} \right)\), the solution is updated by WOA using Eq. (16).

When \(\left( {RN3 \ge 0.5} \right)\), the condition on \(\left| {AV} \right|\) is checked again. In this case, if \(\left( {\left| {AV} \right| \ge 1} \right)\), the update takes place using the rotational movement of SSO as in Eq. (20).

$$\begin{gathered} Za_{{is}}^{{kp + 1,ms}} = Ya_{{is}}^{{kp + 1}} + RN3 \cdot Ya_{{is}}^{{kp + 1}} \hfill \\ ms = 1, \cdots ,Ms\quad is = 1, \cdots ,PS\quad kp = 1, \cdots ,kp_{{\max }} \hfill \\ \end{gathered}$$
(20)

Here, \(RN3\) is a random number, and \(Ms\) is the number of points in the local search of each stage. Otherwise, if \(\left( {\left| {AV} \right| < 1} \right)\), the current search agent position is updated by WOA using Eq. (13). The pseudo code of the proposed SS-WOA is given in Algorithm 1, and its flowchart is shown in Fig. 3.
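The four-way branching described above can be sketched for a single search agent as follows. This is a hedged illustration, not the authors' Algorithm 1. It assumes that \(A\) is treated as one scalar per agent, that the gradient in Eq. (19) is approximated by central differences, that \(\eta = 0.9\) and \(bc = 1\), that only one rotational point is generated instead of \(Ms\), and that Eq. (16) takes the standard WOA form \(X_{rnd} - A \cdot D\). All function names are hypothetical.

```python
import math
import random


def numeric_gradient(ob, x, h=1e-6):
    """Central-difference gradient of ob at x (assumed; Eq. (19) only names it)."""
    grad = []
    for j in range(len(x)):
        up = x[:j] + [x[j] + h] + x[j + 1:]
        down = x[:j] + [x[j] - h] + x[j + 1:]
        grad.append((ob(up) - ob(down)) / (2 * h))
    return grad


def sso_forward(x, ob, eta, rng):
    """Forward movement of SSO, Eqs. (17) and (19), with a time interval of 1."""
    grad = numeric_gradient(ob, x)
    return [xj + eta * rng.random() * gj for xj, gj in zip(x, grad)]


def ss_woa_update(x, best, x_rnd, a, ob, eta=0.9, bc=1.0, rng=random):
    """Return (new_position, branch_name) for one agent of the hybrid update."""
    r = rng.random()
    A = 2 * a * r - a
    C = 2 * r
    rn3 = rng.random()
    if rn3 < 0.5:
        if abs(A) < 1:
            return sso_forward(x, ob, eta, rng), "sso_forward"
        D = [abs(C * xr - xj) for xr, xj in zip(x_rnd, x)]            # Eq. (15)
        return [xr - A * d for xr, d in zip(x_rnd, D)], "woa_explore"  # Eq. (16)
    if abs(A) >= 1:
        y = sso_forward(x, ob, eta, rng)
        return [yj + rn3 * yj for yj in y], "sso_rotation"            # Eq. (20)
    rn2 = rng.uniform(-1.0, 1.0)
    factor = math.exp(bc * rn2) * math.cos(2 * math.pi * rn2)
    return [abs(b - xj) * factor + b for b, xj in zip(best, x)], "woa_spiral"  # Eq. (13)


def toy_objective(v):
    """Hypothetical smooth objective used only to exercise the update."""
    return -sum(t * t for t in v)


new_pos, branch = ss_woa_update([1.0, 2.0], [0.5, 0.5], [3.0, -1.0],
                                a=2.0, ob=toy_objective)
```

In the paper the objective is the classification accuracy, which is not differentiable in the decision variables, so how the gradient term is evaluated in practice is left open here.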

Fig. 3 Flowchart of the proposed SS-WOA


Optimization algorithms handle complex problems through various alterations and improvements (Swamy et al. 2013). A hybrid optimization algorithm (Marsaline Beno et al. 2014) is formed by combining multiple optimization mechanisms or principles. Hybrid optimization algorithms can solve specific search problems and also converge quickly.

5.3 Solution encoding

The solution encoding of the proposed SS-WOA-based leaf classification for the untrained data is displayed in Fig. 4. The activation function, hidden neurons, and threshold of the CNN are optimized by the proposed SS-WOA. The activation function is bounded in the range (1–4), the number of hidden neurons in the range (5–255), and the threshold in the range (0.25–0.75). As per Fig. 2, a threshold value is fixed for classifying the untrained data with the CNN-based classifier. The CNN usually provides both a classification score and a label. In the proposed model, to improve the classification accuracy on untrained data, the classification score is used to fix the label. Since deep learning models are trained in a supervised manner, their predictions depend on the targets and stay close to the training data. Thus, when untrained data are handled, misclassification occurs because the model always selects a label from the trained classes. To solve this problem, a threshold is fixed to decide the final classification label: if the classification score is greater than the threshold, the predicted label is accepted; otherwise, the sample is assigned to the untrained data. Since the classification of untrained data depends on the threshold value, selecting the optimal threshold is the challenging task here. Hence, with accuracy as the objective function, the proposed SS-WOA tunes the threshold value, thereby attaining a faster convergence rate with correct label classification.
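The encoding and the threshold rule can be sketched as follows. The bounds come from the text; the integer coding of the activation functions, the rounding of continuous values, and the names `decode_solution` and `assign_label` are assumptions made for illustration.

```python
# Integer codes for the activation function; the ordering is an assumption.
ACTIVATIONS = {1: "logistic", 2: "tanh", 3: "relu", 4: "leaky_relu"}

# Bounding limits of the three decision variables, as stated in the text.
BOUNDS = {"AF": (1, 4), "HN": (5, 255), "TH": (0.25, 0.75)}


def clip(value, low, high):
    """Keep a decision variable inside its bounding limits."""
    return max(low, min(high, value))


def decode_solution(solution):
    """Map a continuous solution [AF, HN, TH] to usable CNN settings."""
    af_raw, hn_raw, th_raw = solution
    af = int(round(clip(af_raw, *BOUNDS["AF"])))
    hn = int(round(clip(hn_raw, *BOUNDS["HN"])))
    th = clip(th_raw, *BOUNDS["TH"])
    return ACTIVATIONS[af], hn, th


def assign_label(scores, threshold, unknown="others"):
    """Accept the top-scoring label only if its score exceeds the threshold."""
    label = max(scores, key=scores.get)
    return label if scores[label] > threshold else unknown


settings = decode_solution([2.6, 300, 0.1])
known = assign_label({"class_3": 0.9, "class_7": 0.1}, threshold=0.5)
untrained = assign_label({"class_3": 0.4, "class_7": 0.35}, threshold=0.5)
```

Here the out-of-range values are clipped to the nearest bound, and a sample whose best score is 0.4 falls below the threshold of 0.5 and is therefore reported as untrained data.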

Fig. 4 Solution encoding of SS-WOA-based untrained leaf classification

In Fig. 4, the terms \(AF,\;HN\) and \(TH\) represent the activation function, hidden neurons, and threshold of the CNN that are to be optimized by the proposed SS-WOA. The four activation functions used are the logistic, Tanh, ReLU, and Leaky ReLU functions. The logistic function is the common sigmoid curve, often used to model real-life quantities whose growth levels off as the growth rate changes from increasing to decreasing. The tanh function is similar to the logistic sigmoid, but its range is (-1 to 1). Here, “the negative inputs are mapped strongly negative and the zero inputs are mapped near zero”. The rectified linear activation function, otherwise known as ReLU, is a piecewise linear function that outputs the input directly if it is positive and outputs zero otherwise. Leaky ReLU is used to fix the “dying ReLU” problem since it has no zero-slope parts. It also makes the training process faster.
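The four candidate activation functions can be written out as a short sketch. The negative-side slope of 0.01 for Leaky ReLU is an assumed, commonly used value; the text does not state which slope was used.

```python
import math


def logistic(x):
    """Sigmoid curve with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Hyperbolic tangent with outputs in (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Pass positive inputs through unchanged and output zero otherwise."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Like ReLU, but with a small non-zero slope for negative inputs."""
    return x if x > 0 else slope * x


values = [f(-2.0) for f in (logistic, tanh, relu, leaky_relu)]
```

For the input -2.0, ReLU returns exactly zero while Leaky ReLU returns a small negative value, which is what keeps a gradient flowing for negative inputs.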

6 Results and discussions

6.1 Experimental setup

The proposed SS-WOA-CNN-based plant leaf classification for the trained as well as the untrained data was implemented in Python on Google Colab. The population size was 10 and the maximum iteration count was 25. The training percentage was varied from 40 to 80% for the analysis in the results section; for example, when 70% of the data was used for training, the remaining 30% was used for testing. As mentioned earlier, Dataset 1 is used for training and testing, and Dataset 2 is used only for testing, since the classification of untrained data is the challenge here. Hence, the model performs well on Dataset 1, as it is used for both training and testing. The proposed SS-WOA-CNN was compared with several machine learning algorithms, namely NB (Fang Oct. 2013), SVM (ShuangYu October 2015), NN (Fernández-Navarro et al. 2017), DNN (Jen-Tzung 2019), and CNN (Rawat and Wang June 2017), and with optimization-based models, namely PSO-CNN (Pedersen and Chipperfield 2010), GWO-CNN (Seyedali et al. 2014), WOA-CNN (Mirjalili and Lewis 2016), and SSO-CNN (Failed 2014). The comparison was made in terms of Type I or positive measures, “accuracy, sensitivity, specificity, precision, NPV, F1 Score, and MCC”, and Type II or negative measures, “FPR, FNR, and FDR”, to determine its superiority in classifying the plant leaves for the trained as well as the untrained data.

6.2 Performance metrics

The performance measures used for classifying the plant leaves on the trained as well as the untrained data are listed below.

  1. a.

    Accuracy: It is described in Eq. (8).

  2. b.

    Specificity: “the number of true negatives, which are determined precisely”.

    $$Spe = \frac{{trn}}{{trn + fap}}$$
    (21)
  3. c.

    FPR: “the ratio of count of false positive predictions to the entire count of negative predictions”.

    $$FPR = \frac{{fap}}{{fap + trn}}$$
    (22)
  4. d.

    NPV: “probability that subjects with a negative screening test truly don't have the disease”.

    $$NPV = \frac{{trn}}{{trn + fan}}$$
    (23)
  5. e.

    F1 score: “harmonic mean between precision and recall. It is used as a statistical measure to rate performance”.

    $$F1score = \frac{{2 \bullet Sen \bullet \Pr e}}{{\Pr e + Sen}}$$
    (24)
  6. f.

    Sensitivity: “the number of true positives, which are recognized exactly”.

    $$Sen = \frac{{trp}}{{trp + fan}}$$
    (25)
  7. g.

    Precision: “the ratio of positive observations that are predicted exactly to the total number of observations that are positively predicted”.

    $$\Pr e = \frac{{trp}}{{trp + fap}}$$
    (26)
  8. h.

    FNR: “the proportion of positives which yield negative test outcomes with the test”.

    $$FNR = \frac{{fan}}{{fan + trp}}$$
    (27)
  9. i.

    MCC: “correlation coefficient computed by four values”.

    $$MCC = \frac{{trp \times trn - fap \times fan}}{{\sqrt {\left( {trp + fap} \right)\left( {trp + fan} \right)\left( {trn + fap} \right)\left( {trn + fan} \right)} }}$$
    (28)
  10. j.

    FDR: “the number of false positives in all of the rejected hypotheses”.

    $$FDR = \frac{{fap}}{{fap + trp}}$$
    (29)
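The ten measures can be computed together from the four confusion-matrix counts. The sketch below uses the standard definitions of these measures; the function name `confusion_metrics` and the example counts are hypothetical and are not results from the paper.

```python
import math


def confusion_metrics(trp, trn, fap, fan):
    """Compute the positive and negative measures from confusion-matrix counts."""
    sen = trp / (trp + fan)   # sensitivity (recall)
    spe = trn / (trn + fap)   # specificity
    pre = trp / (trp + fap)   # precision
    mcc_den = math.sqrt((trp + fap) * (trp + fan) * (trn + fap) * (trn + fan))
    return {
        "accuracy": (trp + trn) / (trp + trn + fap + fan),
        "sensitivity": sen,
        "specificity": spe,
        "precision": pre,
        "FPR": fap / (fap + trn),
        "FNR": fan / (fan + trp),
        "NPV": trn / (trn + fan),
        "FDR": fap / (fap + trp),
        "F1": 2 * pre * sen / (pre + sen),
        "MCC": (trp * trn - fap * fan) / mcc_den,
    }


# Illustrative counts only.
metrics = confusion_metrics(trp=40, trn=50, fap=5, fan=5)
```

Each negative measure is the complement of a positive one (FPR = 1 - specificity, FNR = 1 - sensitivity, FDR = 1 - precision), which gives a quick consistency check on any reported table.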

6.3 Trained and untrained classification

The classification of plant leaves by the developed model, the existing machine learning models, and the heuristic-based CNNs on the trained and the untrained data is shown in Figs. 5 and 6. For the 15 trained class labels, the accuracy is high. For the untrained data (others), the accuracy is lower than for the trained data, which is what makes this case challenging. From Fig. 5a, for the 6th class label, the accuracy of the developed SS-WOA-CNN is 6.67% higher than NB, 9.09% higher than SVM, 5.49% higher than NN, 6.67% higher than DNN, and 4.35% higher than CNN. For the 14th class label in Fig. 5b, the accuracy of the presented SS-WOA-CNN is 6.52% higher than NB, 5.38% higher than SVM, 4.26% higher than NN, 6.52% higher than DNN, and 5.38% higher than CNN. Further, for the untrained data (others), the accuracy of the presented SS-WOA-CNN is 7.14% higher than NB, 5.63% higher than SVM, 4.17% higher than NN, 4.17% higher than DNN, and 1.35% higher than CNN. In Fig. 6a, for the 8th class label, the accuracy of the developed SS-WOA-CNN is 2.08% higher than PSO-CNN, 2.08% higher than GWO-CNN, 3.16% higher than WOA-CNN, and 1.03% higher than SSO-CNN. Similarly, in Fig. 6b, for the untrained label (others), the accuracy of the proposed SS-WOA-CNN is 1.45% higher than PSO-CNN, 1.45% higher than GWO-CNN, 2.94% higher than WOA-CNN, and 4.48% higher than SSO-CNN. Hence, the proposed SS-WOA-CNN classifies plant leaves better than the conventional machine learning models and the heuristic-based CNNs on both the trained and the untrained data.

Fig. 5 Effect of proposed and conventional machine learning models for plant leaf classification on trained and untrained data a Class 1 to Class 8, and b Class 9 to Class 15 with untrained data (others)

Fig. 6 Effect of proposed and conventional heuristic-based CNN for plant leaf classification on trained and untrained data a Class 1 to Class 8, and b Class 9 to Class 15 with untrained data (others)

6.4 Performance analysis of heuristic-based CNN

The performance of the presented and the traditional heuristic-based CNNs for plant leaf classification on trained and untrained data is depicted in Fig. 7 for various measures and learning percentages. The positive measures are higher and the negative measures are lower for the proposed SS-WOA-CNN, which shows its superiority in classifying the plant leaves for the trained as well as the untrained data. From Fig. 7a, the accuracy of the presented SS-WOA-CNN at 85% learning percentage is 0.83% higher than PSO-CNN, 0.81% higher than GWO-CNN, 1.33% higher than WOA-CNN, and 1.60% higher than SSO-CNN. In Fig. 7b, at a learning percentage of 65%, the sensitivity of the developed SS-WOA-CNN is 8.94% higher than PSO-CNN, 10.74% higher than GWO-CNN, 5.51% higher than WOA-CNN, and 14.20% higher than SSO-CNN. In Fig. 7c, at a learning percentage of 75%, the specificity of the proposed SS-WOA-CNN is 0.63% higher than PSO-CNN, 0.80% higher than GWO-CNN, 0.62% higher than WOA-CNN, and 0.59% higher than SSO-CNN. In Fig. 7d, the precision of the proposed SS-WOA-CNN at 85% learning percentage is 7.88% higher than PSO-CNN, 7.73% higher than GWO-CNN, 11.91% higher than WOA-CNN, and 15.43% higher than SSO-CNN. In Fig. 7e, the FPR of the proposed SS-WOA-CNN at 75% learning percentage is 31.63% lower than PSO-CNN, 35.58% lower than GWO-CNN, 31.28% lower than WOA-CNN, and 28.72% lower than SSO-CNN. In Fig. 7f, at 65% learning percentage, the FNR of the proposed SS-WOA-CNN is 25.76% lower than PSO-CNN, 27.41% lower than GWO-CNN, 17.65% lower than WOA-CNN, and 34.23% lower than SSO-CNN. At a learning percentage of 75% in Fig. 7g, the NPV of the proposed SS-WOA-CNN is 0.60% higher than PSO-CNN, 0.78% higher than GWO-CNN, 0.59% higher than WOA-CNN, and 0.57% higher than SSO-CNN. In Fig. 7h, at 85% learning percentage, the FDR of the proposed SS-WOA-CNN is 24% lower than PSO-CNN, 20.83% lower than GWO-CNN, 32.14% lower than WOA-CNN, and 36.67% lower than SSO-CNN. In Fig. 7i, at 65% learning percentage, the F1 Score of the developed SS-WOA-CNN is 8.94% higher than PSO-CNN, 11.36% higher than GWO-CNN, 7.49% higher than WOA-CNN, and 11.98% higher than SSO-CNN. Moreover, in Fig. 7j, at 85% learning percentage, the MCC of the proposed SS-WOA-CNN is 8.67% higher than PSO-CNN, 8.09% higher than GWO-CNN, 13.92% higher than WOA-CNN, and 17.60% higher than SSO-CNN. Thus, the proposed SS-WOA-CNN performs better than the existing heuristic-based CNNs in classifying the plant leaves for both the trained and the untrained data.

Fig. 7 Performance analysis of proposed and conventional heuristic-based CNN for plant leaf classification on trained and untrained data concerning metrics a accuracy, b sensitivity, c specificity, d precision, e FPR, f FNR, g NPV, h FDR, i F1 score, j MCC

6.5 Performance analysis of machine learning

The performance of the proposed and the conventional machine learning models in classifying the plant leaves on trained as well as untrained data is portrayed in Fig. 8 for the various measures. The outcomes show the improvement achieved by the proposed SS-WOA-CNN. From Fig. 8a, at a learning percentage of 75%, the accuracy of the proposed SS-WOA-CNN is 4.40% higher than NB, 3.26% higher than SVM, 1.06% higher than NN, 2.15% higher than DNN, and 1.06% higher than CNN. In Fig. 8b, at 85% learning percentage, the sensitivity of the proposed SS-WOA-CNN is 59.62% higher than NB, 50.91% higher than SVM, 27.69% higher than NN, 18.57% higher than DNN, and 10.67% higher than CNN. At a learning percentage of 65% in Fig. 8c, the specificity of the proposed SS-WOA-CNN is 5.43% higher than NB, 4.30% higher than SVM, 2.11% higher than NN, 3.19% higher than DNN, and 2.11% higher than CNN. From Fig. 8d, at 85% learning percentage, the precision of the proposed SS-WOA-CNN is 48.15% higher than NB, 37.93% higher than SVM, 21.21% higher than NN, 26.98% higher than DNN, and 2.56% higher than CNN. In Fig. 8e, at 75% learning percentage, the FPR of the proposed SS-WOA-CNN is 57.58% lower than NB, 56.25% lower than SVM, 36.36% lower than NN, 48.15% lower than DNN, and 30% lower than CNN. At 65% learning percentage in Fig. 8f, the FNR of the developed SS-WOA-CNN is 60% lower than NB, 61.70% lower than SVM, 45.45% lower than NN, 43.75% lower than DNN, and 28% lower than CNN. From Fig. 8g, at 75% learning percentage, the NPV of the presented SS-WOA-CNN is 2.13% higher than NB, SVM, and DNN, and 1.05% higher than NN and CNN. In Fig. 8h, at a learning percentage of 85%, the FDR of the proposed SS-WOA-CNN is 60% lower than NB, 55% lower than SVM, 40% lower than NN, 47.06% lower than DNN, and 5.26% lower than CNN. In Fig. 8i, at 75% learning percentage, the F1 Score of the proposed SS-WOA-CNN is 66.67% higher than NB, 42.86% higher than SVM, 14.29% higher than NN, 25% higher than DNN, and 8.11% higher than CNN. Further, from Fig. 8j, at 65% learning percentage, the MCC of the proposed SS-WOA-CNN is 62.5% higher than NB, 59.18% higher than SVM, 23.81% higher than NN, 25.81% higher than DNN, and 11.43% higher than CNN. Therefore, the proposed SS-WOA-CNN performs better than the traditional machine learning models in classifying the plant leaves for the trained as well as the untrained data.

Fig. 8 Performance analysis of proposed and conventional machine learning models for plant leaf classification on trained and untrained data concerning metrics a accuracy, b sensitivity, c specificity, d precision, e FPR, f FNR, g NPV, h FDR, i F1 score, j MCC

6.6 Overall analysis

The overall analysis of the proposed and the conventional heuristic-based CNNs and machine learning models in classifying the plant leaves on the trained and untrained data is listed in Tables 4 and 5. The SS-WOA-CNN is able to avoid local optima and reach the global optimal solution, and it can solve constrained or unconstrained problems in real applications. Due to these advantages, the SS-WOA-CNN performs better than the other existing methods. The positive measures are higher and the negative measures are lower, which shows the superiority of the proposed SS-WOA-CNN. From Table 4, the accuracy of the proposed SS-WOA-CNN is 0.86%, 0.78%, 1.28%, and 1.53% higher than PSO-CNN, GWO-CNN, WOA-CNN, and SSO-CNN, respectively. The sensitivity is 8.59%, 6.92%, 13.93%, and 15.83% higher than PSO-CNN, GWO-CNN, WOA-CNN, and SSO-CNN. The specificity is 0.43%, 0.43%, 0.61%, and 0.78% higher than PSO-CNN, GWO-CNN, WOA-CNN, and SSO-CNN. The precision is 7.96%, 7.54%, 11.95%, and 15.16% higher than PSO-CNN, GWO-CNN, WOA-CNN, and SSO-CNN. The FPR is 23.25%, 23.25%, 29.79%, and 35.29% lower than PSO-CNN, GWO-CNN, WOA-CNN, and SSO-CNN. The FNR is 27.50%, 23.68%, 36.96%, and 39.58% lower than PSO-CNN, GWO-CNN, WOA-CNN, and SSO-CNN. The NPV is 0.43%, 0.43%, 0.61%, and 0.78% higher than PSO-CNN, GWO-CNN, WOA-CNN, and SSO-CNN. The FDR is 23.70%, 22.81%, 31.01%, and 35.67% lower than PSO-CNN, GWO-CNN, WOA-CNN, and SSO-CNN. The F1 Score is 8.27%, 7.24%, 12.93%, and 15.49% higher than PSO-CNN, GWO-CNN, WOA-CNN, and SSO-CNN. The MCC is 9.08%, 7.94%, 14.24%, and 17.12% higher than PSO-CNN, GWO-CNN, WOA-CNN, and SSO-CNN.

From Table 5, the accuracy of the proposed SS-WOA-CNN is 4.02%, 3.23%, 1.95%, 2.12%, and 0.57% higher than NB, SVM, DNN, NN, and CNN, respectively. The sensitivity is 59.77%, 0.87%, 28.70%, 19.83%, and 13.01% higher than NB, SVM, DNN, NN, and CNN. The specificity is 1.89%, 1.40%, 0.69%, 1.22%, and 0.09% higher than NB, SVM, DNN, NN, and CNN. The precision is 51.41%, 36.70%, 17.48%, 23.31%, and 1.18% higher than NB, SVM, DNN, NN, and CNN. The FPR is 56.58%, 49.23%, 32.65%, 45.90%, and 6.46% lower than NB, SVM, DNN, NN, and CNN. The FNR is 64.20%, 60.81%, 51.67%, 44.23%, and 35.56% lower than NB, SVM, DNN, NN, and CNN. The NPV is 1.89%, 1.40%, 0.69%, 1.22%, and 0.09% higher than NB, SVM, DNN, NN, and CNN. The FDR is 58.85%, 53.07%, 38.53%, 44.33%, and 4.69% lower than NB, SVM, DNN, NN, and CNN. The F1 Score is 55.54%, 42.22%, 23.03%, 21.59%, and 7.03% higher than NB, SVM, DNN, NN, and CNN. Moreover, the MCC is 63.38%, 47.43%, 25.33%, 24.03%, and 7.45% higher than NB, SVM, DNN, NN, and CNN, respectively. Hence, the proposed SS-WOA-CNN classifies the plant leaves better on the trained as well as the untrained data than the existing machine learning and heuristic-based CNN methods.

Table 4 Overall Analysis of proposed and conventional heuristic-based CNN for plant leaf classification on trained and untrained data
Table 5 Overall analysis of proposed and conventional machine learning models for plant leaf classification on trained and untrained data

7 Conclusion

This paper has introduced a new deep learning-based plant leaf classification model. The experiments were carried out on the publicly available standard datasets called the “Swedish leaf dataset” and the “Mendeley data”. RGB to gray scale conversion, histogram equalization, and median filtering were used to increase the image quality. The leaf classification was done by a CNN whose hidden neurons and activation function were optimized by the proposed SS-WOA to attain the maximum classification accuracy. The optimized threshold-based CNN classification was performed for handling the untrained images. This method observes the classification score instead of the classification label. The threshold was fixed on a trial-and-error basis through the optimization: the hybrid SS-WOA optimized the threshold value so as to obtain the maximum classification accuracy for the untrained data. Images were classified on the basis of the highest classification score and whether it exceeded the threshold value, which enhances the performance in handling both trained and untrained data. The comparison of the proposed model with diverse traditional machine learning models demonstrated its efficiency. From the analysis, the accuracy of the developed SS-WOA-CNN was 0.86%, 0.78%, 1.28%, and 1.53% better than PSO-CNN, GWO-CNN, WOA-CNN, and SSO-CNN, and 4.02%, 3.23%, 1.95%, 2.12%, and 0.57% better than NB, SVM, DNN, NN, and CNN, respectively. Thus, the proposed SS-WOA-CNN achieved better outcomes in plant leaf classification for the trained as well as the untrained data. However, a limitation of the proposed system is that the CNN needs to be trained if fewer images are available than required.