
1 Introduction

With the accelerating extinction rate of plant species, it is urgent to protect plants [1, 2]. A preliminary task is the classification of plant types, which is challenging, complex, and time-consuming. As is well known, plants include flowering plants, conifers and other gymnosperms, and so on. Many of them do not bloom or bear fruit, but almost all of them have leaves [3]. Therefore, in this paper we focus on feature extraction and classification of leaves.

Recent studies are analyzed below. Heymans, Onema and Kuti [4] proposed a neural network to distinguish different leaf-forms of the Opuntia species. Wu, Bao, Xu, Wang, Chang and Xiang [5] employed a probabilistic neural network (PNN) for a leaf recognition system. Wang, Huang, Du, Xu and Heutte [6] classified plant leaf images with complicated backgrounds. Jeatrakul and Wong [7] introduced the back-propagation neural network (BPNN), the radial basis function neural network (RBFNN), and the probabilistic neural network (PNN), and compared their performances. Dyrmann, Karstoft and Midtiby [8] used a convolutional neural network (CNN) to classify plant species. Zhang, Lei, Zhang and Hu [9] employed semi-supervised orthogonal discriminant projection for plant leaf classification.

Although these methods have achieved good results, artificial neural networks (ANNs) offer higher accuracy and less time consumption in classification than other approaches. Therefore, in this paper, we employ the BPNN algorithm for the automatic identification and classification of leaves.

Our contributions in this paper include: (i) we propose a five-step preprocessing method, which removes irrelevant information from the leaf image; (ii) we develop a leaf recognition system.

2 Methodology

2.1 Pretreatment

We put a sheet of glass over the leaves so as to flatten the curled ones. All leaves were photographed indoors (placed on white paper) with a digital camera (Canon EOS 70D) by two photographers, each with over five years of experience. The camera pose was fixed on a tripod during imaging. Two light-emitting diode (LED) lights were hung 6 inches above the leaves. Images out of focus were removed.

2.2 Image Preprocessing

During imaging, background information is captured, which will interfere with the detection of the leaves. Therefore, the necessary preprocessing is to remove the irrelevant background and color channels.

First, we suppose texture is related to the leaf category, while color information is of little help. Hence, we removed the background of the image. Second, we converted the RGB image into a grayscale image. Table 1 shows the steps of image preprocessing, and a code sketch is given below.
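
The two steps can be illustrated by a minimal Python sketch. This is not the paper's MATLAB implementation; it assumes a plain white-paper background, and the brightness threshold of 200 is an illustrative assumption rather than a parameter from the paper.

```python
from PIL import Image
import numpy as np

def preprocess_leaf(path, bg_threshold=200, size=(200, 200)):
    """Set the bright paper background to black, then convert to grayscale."""
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.uint8).copy()

    # Pixels where all three channels are bright are treated as the
    # white-paper background and painted black (cf. Fig. 1(b)).
    background = (rgb > bg_threshold).all(axis=2)
    rgb[background] = 0

    # Convert to grayscale (PIL's "L" mode uses ITU-R BT.601 luma weights).
    gray = Image.fromarray(rgb).convert("L")

    # Resize to the working resolution used in the paper (200 x 200).
    return np.asarray(gray.resize(size))
```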

Table 1. Steps of image preprocessing

Figure 1(a) shows the original image, Fig. 1(b) the image with the background turned black, and Fig. 1(c) the grayscale image.

Fig. 1. Illustration of leaves

2.3 Feature Extraction

The Fourier transform (FT) [10] is a signal analysis method that decomposes a continuous signal into harmonic waves of different frequencies [11]. It can handle the original signal more easily than traditional methods. However, it has a limitation for non-stationary processes: it only reveals which frequencies a signal contains overall, while the moment at which each frequency occurs remains unknown. A simple and feasible remedy is to add a window: the entire time-domain process is decomposed into a number of short segments of equal length, each approximately stationary, and applying the FT to every segment tells us when each frequency appears. This is called the short-time Fourier transform (STFT) [12, 13].

The drawback of the STFT is that it is unclear how to choose the size of the windowing function. If the window is small, the time resolution is good but the frequency resolution is poor; conversely, if the window is large, the time resolution is poor but the frequency resolution is good.
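
This trade-off can be seen directly in code. The following is a minimal sketch, not part of the paper's pipeline, using SciPy's `signal.stft` on a toy two-tone signal; the sampling rate and window lengths are illustrative assumptions.

```python
import numpy as np
from scipy import signal

# A toy non-stationary signal: 50 Hz in the first second, 120 Hz in the second.
fs = 1000  # sampling rate in Hz
t = np.arange(0, 2.0, 1 / fs)
x = np.where(t < 1.0, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 120 * t))

# Short window: fine time resolution, coarse frequency resolution.
f_short, t_short, Z_short = signal.stft(x, fs=fs, nperseg=64)

# Long window: coarse time resolution, fine frequency resolution.
f_long, t_long, Z_long = signal.stft(x, fs=fs, nperseg=1024)

# The frequency-bin spacing shrinks as the window grows (fs / nperseg):
print(f_short[1] - f_short[0], f_long[1] - f_long[0])  # 15.625 Hz vs. ~0.977 Hz
```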

The wavelet transform (WT) [14,15,16] has been hailed as a microscope of signal processing and analysis. It can analyze non-stationary signals and extract their local characteristics. In addition, the wavelet transform adapts to the signal during processing and analysis [17]; it is therefore a newer information-processing method that is superior to the FT and STFT.

In this paper, the feature extraction [18, 19] from the original leaf images is carried out by a 2-level wavelet transform, which decomposes the leaf images into low-frequency and high-frequency coefficient sub-bands. However, a massive number of features not only increases the computing cost and consumes much storage memory, but also contributes little to classification [20]. We therefore need to select the important features. Entropy [21, 22] is used in information theory to measure the amount of information of a whole system, and can also represent the texture of an image. Consider a discrete source

$$ \left[ \begin{array}{c} X \\ p(x) \\ \end{array} \right] = \left[ \begin{array}{cccccc} x_{1} & x_{2} & x_{3} & \ldots & x_{n} & x_{n + 1} \\ p_{1} & p_{2} & p_{3} & \ldots & p_{n} & p_{n + 1} \\ \end{array} \right], \quad 0 \le p_{i} \le 1, \quad \sum\limits_{i = 1}^{n + 1} {p_{i} } = 1 $$
(1)

where \( X \) is a discrete random variable and \( p(x) \) is its probability mass function. The amount of information contained in a message signal \( x_{i} \) can be expressed as

$$ I(x_{i} ) = - \log p_{i} $$
(2)

Since \( I(x_{i} ) \) is itself a random variable, it cannot be used as an information measure for the entire source [23]. Shannon, the originator of modern information theory, defined the average information content of \( X \) as the information entropy [24, 25]:

$$ H(X) = \text{E} [I(x_{i} )] = - \sum {p_{i} } \log p_{i} $$
(3)

where \( \text{E} \) is the expected value operator and \( H \) represents the entropy. For example, a binary source with \( p_{1} = p_{2} = 0.5 \) attains the maximum entropy \( H = 1 \) bit. Figure 2 shows the entropy of a source.

Fig. 2. The entropy of a source

Seven comprehensive indexes (as shown in Fig. 3) are obtained after applying the 2-level WT: four matrices of size \( 50 \times 50 \) and three of size \( 100 \times 100 \). The entropy of each of these matrices is calculated and used as the input to the following BP neural network (BPNN).

Fig. 3. 2-level WT
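
As a concrete illustration, the following Python sketch reproduces this feature extraction under stated assumptions: the paper does not specify the wavelet family or the exact entropy estimator, so we assume the Haar wavelet (for which a \( 200 \times 200 \) image indeed yields four \( 50 \times 50 \) and three \( 100 \times 100 \) sub-bands) and a 256-bin histogram estimate of the Shannon entropy of Eq. (3).

```python
import numpy as np
import pywt  # PyWavelets

def shannon_entropy(coeffs, bins=256):
    """Histogram-based Shannon entropy, H = -sum(p * log2 p), cf. Eq. (3)."""
    hist, _ = np.histogram(coeffs.ravel(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]  # 0 * log 0 is taken as 0
    return -np.sum(p * np.log2(p))

def wavelet_entropy_features(gray):
    """Seven entropies from a 2-level 2-D wavelet decomposition.

    For a 200 x 200 input this yields four 50 x 50 sub-bands
    (cA2, cH2, cV2, cD2) and three 100 x 100 sub-bands (cH1, cV1, cD1).
    """
    cA2, (cH2, cV2, cD2), (cH1, cV1, cD1) = pywt.wavedec2(gray, "haar", level=2)
    subbands = [cA2, cH2, cV2, cD2, cH1, cV1, cD1]
    return np.array([shannon_entropy(s) for s in subbands])
```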

2.4 Back-Propagation Neural Network

The back-propagation (BP) algorithm is mainly used for regression and classification, and is one of the most widely used neural network training models. Figure 4 shows the diagram of a BPNN. The numbers of nodes in the input layer and the output layer are determined by the task, but the number of nodes in the hidden layer is not fixed. An empirical formula can help determine the number of hidden-layer nodes, as follows:

Fig. 4. The diagram of BP

$$ t = \sqrt {r + s} + c $$
(4)

where \( t,r,s \) represent the numbers of nodes in the hidden layer, input layer, and output layer, respectively, and \( c \) is an adjustment constant between zero and ten.

In our experiment, the leaf features were reduced to seven after feature selection, i.e., \( r = 7 \). Meanwhile, there is one output unit, which stands for the predicted result, so \( s = 1 \). With \( c \) a constant, guided by formula (4) we set \( t = 15 \).

The BPNN algorithm belongs to supervised learning methods. The main ideas of the BP learning rule are as follows.

Known vectors: the input learning samples \( \{ P^{1}, P^{2}, \ldots, P^{q} \} \) and the corresponding output samples \( \{ T^{1}, T^{2}, \ldots, T^{q} \} \).

Learning objective: the weights are modified according to the error between the target vectors \( \{ T^{1}, T^{2}, \ldots, T^{q} \} \) and the actual network outputs \( \{ A^{1}, A^{2}, \ldots, A^{q} \} \), so that each \( A^{i} \,(i = 1,2, \ldots ,q) \) is as close as possible to the expected \( T^{i} \), i.e., the error sum of squares at the network output layer is minimized.

The BPNN algorithm consists of two phases: the forward propagation of the working signal and the backward propagation of the error. In the forward propagation process, the state of each layer of neurons only affects the state of the next layer:

$$ x_{j} = \mathop \sum \limits_{i} w_{ij} P^{i} $$
(5)

where \( x_{j} \) represents the input of the \( j \)-th hidden-layer node and \( w_{ij} \) is the weight between the input layer and the hidden layer.

$$ x_{j}^{'} = f(x_{j} ) = 1/(1 + e^{{( - x_{j} )}} ) $$
(6)

Equation (6) gives the activation function of the hidden layer.

$$ A^{i} = \mathop \sum \limits_{j} w_{jk} x_{j}^{{\prime }} $$
(7)

where \( A^{i} \) is the actual output of the output layer and \( w_{jk} \) is the weight between the hidden layer and the output layer.

$$ e = \frac{1}{2}\mathop \sum \limits_{i} \left( {A^{i} - T^{i} } \right)^{2} $$
(8)

where \( T^{i} \) is the desired output.

If the desired output \( T^{i} \) is not obtained at the output layer, the error is calculated by Eq. (8) and then propagated backward through the network to update the weights.
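
The forward phase of Eqs. (5)–(8) can be written compactly as follows. This is a minimal sketch with the paper's layer sizes (\( r = 7 \), \( t = 15 \), \( s = 1 \)); biases are omitted, as in Eqs. (5)–(7), and the backward (weight-update) phase is not shown. The random weights are purely for demonstration.

```python
import numpy as np

def forward(P, w_in, w_out):
    """One forward pass following Eqs. (5)-(7).

    P     : input feature vector, shape (r,)
    w_in  : input-to-hidden weights w_ij, shape (r, t)
    w_out : hidden-to-output weights w_jk, shape (t, s)
    """
    x = P @ w_in                        # Eq. (5): hidden-layer input
    x_prime = 1.0 / (1.0 + np.exp(-x))  # Eq. (6): sigmoid activation
    A = x_prime @ w_out                 # Eq. (7): actual output
    return A

def sse_error(A, T):
    """Eq. (8): half the sum of squared errors between output and target."""
    return 0.5 * np.sum((A - T) ** 2)

# Example with the paper's layer sizes: r = 7 inputs, t = 15 hidden, s = 1 output.
rng = np.random.default_rng(0)
P = rng.random(7)
w_in, w_out = rng.standard_normal((7, 15)), rng.standard_normal((15, 1))
print(sse_error(forward(P, w_in, w_out), T=np.array([1.0])))
```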

2.5 Implementation of the Proposed Method

The aim of this study is to distinguish the types of leaves with high classification accuracy. The steps of the proposed system are sample collection, pretreatment, preprocessing, feature extraction, and classification (Fig. 5). Table 2 shows the pseudocode of the proposed system.

Fig. 5. Pipeline of our proposed method

Table 2. Pseudocode of the proposed system
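
To complement the pseudocode, a compact sketch of how the stages fit together is given below. It reuses the hypothetical helpers `preprocess_leaf` (Sect. 2.2) and `wavelet_entropy_features` (Sect. 2.3) sketched earlier, and uses scikit-learn's `MLPClassifier`, a back-propagation-trained network, as a stand-in for the MATLAB BPNN; `paths` and `labels` are assumed inputs.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# `preprocess_leaf` and `wavelet_entropy_features` are the helper sketches
# from Sects. 2.2 and 2.3; `paths` and `labels` are hypothetical inputs.
def build_dataset(paths, labels):
    """Stack one 7-dimensional wavelet-entropy vector per leaf image."""
    X = np.stack([wavelet_entropy_features(preprocess_leaf(p)) for p in paths])
    return X, np.asarray(labels)

# A single hidden layer of 15 units mirrors t = 15 from Eq. (4); MLPClassifier
# (trained by back-propagation) stands in here for the MATLAB BPNN.
bpnn = MLPClassifier(hidden_layer_sizes=(15,), max_iter=2000, random_state=0)
```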

2.6 Five-Fold Cross Validation

The training criterion of the standard BP neural network requires that the error sum of squares (the fitting error) between the expected and actual output values over all samples be less than a given allowable error \( \upvarepsilon \). In general, the smaller the value of \( \upvarepsilon \), the better the fitting accuracy. Nevertheless, in practical applications the prediction error decreases together with the fitting error at first; however, once the fitting error on the training set falls below a certain value, the prediction error on the test set starts to increase, which indicates that the generalization ability decreases. This is the “over-fitting” phenomenon encountered in BPNN modeling. In this paper, cross-validation is used to prevent over-fitting.

The basic idea of cross validation is that the original dataset is divided into two parts: a training set and a validation set. First, the classifier is trained on the training set; then the trained model is tested on the validation set, and the results are used to evaluate the performance of the classifier.

The steps of \( k \)-fold cross validation are as follows (a code sketch is given after the list):

Step 1. The whole training set \( S \) is divided into \( k \) disjoint subsets; if the number of training examples in \( S \) is \( m \), each subset contains \( m/k \) examples. The subsets are denoted \( \left\{ {s_{1} ,s_{2} , \ldots ,s_{k} } \right\} \).

Step 2. Each time, one subset is selected from \( \left\{ {s_{1} ,s_{2} , \ldots ,s_{k} } \right\} \) as the test set, and the other \( (k - 1) \) subsets together form the training set.

Step 3. A model or hypothesis function is obtained by training on the training set.

Step 4. The model is applied to the test set to obtain the classification rate.

Step 5. The average classification rate over the \( k \) runs is regarded as the true classification rate of the model or hypothesis function.
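
The following sketch implements these five steps with scikit-learn. Two caveats: the paper's own folds (Fig. 6) do not appear to be exactly stratified, so the use of `StratifiedKFold` here is an assumption, as are the stand-in classifier settings.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier

def five_fold_accuracy(X, y, seed=0):
    """Average accuracy over 5 folds (Steps 1-5); X, y are NumPy arrays."""
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)  # Step 1
    scores = []
    for train_idx, test_idx in skf.split(X, y):                         # Step 2
        clf = MLPClassifier(hidden_layer_sizes=(15,), max_iter=2000,
                            random_state=seed)
        clf.fit(X[train_idx], y[train_idx])                             # Step 3
        scores.append(clf.score(X[test_idx], y[test_idx]))              # Step 4
    return float(np.mean(scores))                                       # Step 5
```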

3 Experiment and Discussions

The BP method is implemented in MATLAB R2016a (The MathWorks, Natick, MA, USA). The experiments were performed on a computer with a 3.30 GHz CPU and 4 GB RAM, running 64-bit Windows 8.

3.1 Database

We used leaves as the objects of study; all images are of size \( 200 \times 200 \) and in JPEG format. The input dataset contains 90 images: 30 of Ginkgo biloba, 30 of Phoenix tree leaves, and 30 of Osmanthus leaves. As mentioned above, we used 5-fold cross validation to prevent over-fitting. Hence, in each trial 72 images (25 Ginkgo biloba, 26 Osmanthus, and 21 Phoenix tree) were used for training and the remaining 18 (5 Ginkgo biloba, 4 Osmanthus, and 9 Phoenix tree) for testing. Figure 6 shows one trial of the 5-fold cross-validation.

Fig. 6. One trial of five-fold cross-validation

3.2 Algorithm Comparison

The wavelet-entropy features were fed into different classifiers. We compared our method (BPNN) with Medium KNN [26], Coarse Gaussian SVM [27], Complex Tree [28], and Cosine KNN [29]. The comparison results are shown in Table 3.

Table 3. Comparison with different methods

The results in Table 3 are the central contribution of our experiment. We can see that the accuracy of our method reaches 90.0%, which is better than that of the other methods. Why does BPNN perform best among the five algorithms? The reason lies in the universal approximation theorem proven in reference [30].
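
For readers who wish to reproduce the comparison outside MATLAB, the following sketch lists rough scikit-learn analogues of the Classification Learner presets. All hyper-parameter values below are assumptions based on common preset defaults, not taken from the paper.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

# Rough scikit-learn analogues of the MATLAB Classification Learner presets.
classifiers = {
    "Medium KNN":          KNeighborsClassifier(n_neighbors=10),
    "Cosine KNN":          KNeighborsClassifier(n_neighbors=10, metric="cosine"),
    "Coarse Gaussian SVM": SVC(kernel="rbf", gamma="scale", C=1.0),
    "Complex Tree":        DecisionTreeClassifier(max_depth=None),
    "BPNN (ours)":         MLPClassifier(hidden_layer_sizes=(15,), max_iter=2000),
}

# Each classifier could then be scored with a five-fold loop like the one in
# Sect. 2.6, replacing the MLPClassifier inside that loop with `clf`.
```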

4 Conclusion and Future Research

This paper introduced a novel automatic classification method for leaf images. By combining the BP neural network with wavelet entropy, our method achieves an accuracy of 90.0%.

Feature extraction takes 1.2064 s for the 90 leaf images, i.e., an average of 0.0134 s per image.

Nevertheless, several problems remain unsolved: (1) we will try to improve the accuracy of the algorithm; (2) we may employ other feature extraction methods in order to reduce the extraction time; (3) we may extend this method to other types of images, such as car images, tree images, etc.