1 Introduction

Plant leaves have become an interesting application of pattern recognition and classification. Many researchers have adopted leaf descriptors to identify plant species automatically. Colour is one of the most widely used low-level visual descriptors and is invariant to image size and orientation. When a colour descriptor is selected, the choice of the underlying colour space also matters greatly. The colour histogram is likewise invariant to orientation and scale. Such characteristics make colour and colour histograms powerful in image recognition and classification. Based on such features, various approaches have been proposed to recognise different kinds of leaves in images. However, most of these approaches suffered from poor performance (e.g., a low accuracy rate) and more or less failed in practice.

The RGB space has been used by many researchers to extract the colour features of an image. Unfortunately, RGB is not well suited to describing colour for human interpretation and thus for daily practice [18]. The work in [11] recognises leaf shape using the Centroid Contour Distance (CCD) as a shape descriptor. CCD is a contour-based approach that represents image shapes by exploiting only boundary information. The approach calculates the distance between the centroid (midpoint) of the shape and the points on its edge at the corresponding angular intervals. Several leaf classification systems have incorporated texture features to improve performance, such as that in [21], which used entropy, homogeneity and contraction derived from the discrete wavelet transform (DWT). Following this success, the approach has become one of the most important and powerful tools in image processing. Kadir et al. [17] used the Polar Fourier Transform and three different kinds of geometric features to represent shapes. They also adopted statistical measures such as mean, standard deviation, and skewness to represent colour features, extracted texture features from GLCMs, and added vein features to leaf identification in order to improve performance. Furthermore, Zulkifli et al. [32] compared the effectiveness of the Zernike Moment Invariant (ZMI), Legendre Moment Invariant (LMI) and Tchebichef Moment Invariant (TMI) in feature extraction from leaf images. The features extracted using the most effective moment invariant technique were then used to classify images with the General Regression Neural Network (GRNN). A combination of features derived from the shape, vein, colour and texture of leaf images has also been proposed, with PCA converting the features into orthogonal features in a leaf identification system [16]. A similar attempt was made by Liu et al. [20], who used a combination of texture features and shape features for identification and deep belief networks (DBNs) as the classifier. The texture features were derived from local binary patterns, Gabor filters and the grey level co-occurrence matrix, while the shape feature vector was modelled using Hu moment invariants and Fourier descriptors.

In this paper, we present a method for recognising the species of plants from their leaves. In this work, local features such as LBP are first extracted after decomposition by the Haar wavelet. These features are then combined with global features such as the HSV colour histogram, which yields promising performance with a high level of accuracy and precision. In our experiments, the proposed method outperformed baseline models representing state-of-the-art leaf image recognition approaches. The work contributes to knowledge advancement in leaf image classification and the recognition of plant species.

The rest of this paper is organised as follows. The related work will be discussed in Sect. 2. The research problem will be formally defined in Sect. 3, followed by the proposed HD method delivered in Sect. 4. After that, the experimental evaluation and the results will be discussed in Sect. 5. Finally, Sect. 6 will conclude the paper and highlight the future work.

2 Related Work

Many works have been conducted on plant species identification relying on leaf recognition. After enhancing images with histogram equalisation and ROI segmentation, VijayaLakshmi and Mohan [27] used Haralick texture features together with Gabor and shape-based features such as area, centroid and orientation, along with colour features obtained after converting from the RGB to the HSV colour system. Finally, they used the Fuzzy Relevance Vector Machine (FRVM) to characterise the type of leaves.

Alternatively, Du et al. [4] extracted Digital Morphology Features (DMF) from the contours of leaves. The DMF generally included Geometrical Features (GF) and invariable Moment Features (MF). Such features were then used with Move Median Centers (MMC) to train an effective classifier. A similar work was completed by Wang et al. [28], who also used MMC and shape features (geometric features and Hu moments) to recognise leaf images. In [2], Gabor wavelet filters were exploited to extract texture features from the foliar surface to improve the performance of plant classification.

Another notable system was built by Pornpanomchai et al. [25] to recognise Thai herb leaves in images (THLI). The system extracted 13 different kinds of features from the leaf images and then employed k-nearest neighbour (k-NN) in the recognition process. In [31], Wang and colleagues introduced the pulse-coupled neural network (PCNN), a new artificial neural network model for feature extraction. They first extracted leaf features using the PCNN, and then classified images with Support Vector Machines (SVMs). Leaves can also be classified using their structural properties. For instance, a leaf usually consists of triangular pieces that protrude around a polygon. Taking advantage of such structural properties, Im et al. [15] classified leaf images by adopting statistical methods along with variations of leaf contours.

The texture features of leaves have also been used in many works for leaf recognition. Ehsanirad et al. [6] extracted the Gray-Level Co-occurrence Matrix (GLCM) and used the Principal Component Analysis (PCA) algorithm to classify plants relying on leaf images. Furthermore, Gu et al. [8] attempted to recognise leaves by segmenting a leaf's skeleton based on the combination of wavelet transform and Gaussian interpolation. They also used k-nearest neighbour (\(k=1\)) combined with a radial basis probabilistic neural network (RBPNN) as the classifier to recognise leaves on the basis of run-length features extracted from the skeleton.

Ahmed and colleagues [1] presented a comparison table of different methods for the identification and classification of leaf images, with an in-depth analysis of the advantages and disadvantages of each method. In [19], Liu and colleagues used wavelet decomposition and the local characteristics of LBP to extract features for face recognition. For a different application, Du et al. [5] used wavelet domain local binary pattern features for writer identification. In a different work, Handa and Agarwal [10] compared different algorithms used in plant classification based on leaf image recognition, together with the accuracy of each one.

These related works, however, still leave room for improvement because they considered either global or local features, but not both. The hypothesis that a hybrid descriptor considering both global and local features would perform better motivated the work presented in this paper.

3 Research Problem

Let \(\mathcal {IMG}=\{img_\imath \in \mathbb {IMG}, \imath =1, \dots , m\}\) be a set of images and \(\mathcal {C}=\{c_1, \dots , c_K\}\) be a set of classes, where \(K=|\mathcal {C}|\). Assuming that a training set \(\mathcal {IMG}_t=\{img_\jmath \in \mathbb {IMG}, \jmath =m+1, \dots , n\}\) is available, with labels \(y^{k}_\jmath \in \{0,1\}, k=1, \dots , K\) indicating whether \(img_\jmath \) belongs to class \(c_k\), our research problem is how to learn an efficient binary prediction function \(f(y^k|img)\) and use it to classify each \(img_\imath \in \mathcal {IMG}\).

4 Descriptors

An innovative method, namely Hybrid Descriptors, is proposed in this paper to tackle the research problem defined in Sect. 3. The proposed method consists of three components: global feature extraction, local feature extraction, and hybrid descriptor generation for the final classification of the images in \(\mathcal {IMG}\). In this section, we introduce the proposed method in detail. For ease of discussion, we refer to the Hybrid Descriptor simply as HD.

4.1 Global Feature Extraction

Global features are the set of features extracted from the whole image. Global features have been widely used with success in image classification. A typical global feature is colour histogram [9]. In this work, we used colour histogram as the global feature despite the visual difference between the gradients in plant leaves.

RGB is a commonly used colour system and is ideally suited for hardware implementations such as colour monitors. Unfortunately, RGB is not well suited to specifying colours because it is not practical for human interpretation. In contrast, the HSV (hue, saturation, value) model is an ideal tool for developing image processing algorithms based on colour descriptions. HSV is deemed more natural and intuitive to humans [3, 18]. For this reason, images are normally converted from the RGB space to the HSV colour space by using the following equations [23]:

$$\begin{aligned} H=\cos ^{-1}\frac{\frac{1}{2}[(R-G)+(R-B)]}{\sqrt{(R-G)^2+(R-B)(G-B)}} \end{aligned}$$
(1)
$$\begin{aligned} S=1- \frac{3}{R+G+B}\min (R,G,B) \end{aligned}$$
(2)
$$\begin{aligned} V=\frac{1}{3}(R+G+B) \end{aligned}$$
(3)

Here R, G and B represent the red, green and blue components, respectively, each with a value between 0 and 255. In order to obtain a value of H from 0\({^\circ }\) to 360\({^\circ }\) and values of S and V between 0 and 1, the following equations are applied [23]:

$$\begin{aligned} H=(\frac{H}{255}\times 360)\%360 \end{aligned}$$
(4)
$$\begin{aligned} V=V/255 \end{aligned}$$
(5)
$$\begin{aligned} S=S/255 \end{aligned}$$
(6)

where % is the modulo operator, which gives the remainder after dividing \((\frac{H}{255}\times 360)\) by 360.
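
As an illustration of Eqs. 1 to 3, the following minimal Python sketch converts a single 8-bit RGB pixel. It rests on several assumptions that are not taken from the text: the function name `rgb_to_hsv_eq` is hypothetical, the RGB values are scaled to [0, 1] before the equations are applied (instead of rescaling afterwards), the hue of a grey pixel is set to 0 by convention, and the angle is reflected to 360° minus the arccosine when B > G, as is commonly done for this form of the hue equation so that the full 0° to 360° range is covered.

```python
import math

def rgb_to_hsv_eq(r, g, b):
    """Convert one 8-bit RGB pixel to (H in degrees, S in [0,1], V in [0,1])."""
    # Assumption: scale the 0-255 inputs to [0, 1] first.
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    total = r + g + b

    v = total / 3.0                                              # Eq. 3
    s = 0.0 if total == 0 else 1.0 - 3.0 * min(r, g, b) / total  # Eq. 2

    num = 0.5 * ((r - g) + (r - b))                              # Eq. 1 numerator
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))            # Eq. 1 denominator
    if den == 0:
        h = 0.0  # grey pixel: hue is undefined, 0 chosen by convention
    else:
        ratio = max(-1.0, min(1.0, num / den))  # guard against rounding error
        theta = math.degrees(math.acos(ratio))
        # Assumption: reflect the angle when blue exceeds green.
        h = theta if b <= g else 360.0 - theta
    return h, s, v
```

For example, `rgb_to_hsv_eq(0, 255, 0)` gives a hue of about 120° with full saturation, which falls in the hue partition [91, 135] used later in the quantisation step.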

Generally, for a given colour image, the number of actual colours occupies only a small proportion of the total number of colours in the entire colour space. Therefore, the hue component, which represents the colour information, is uniformly divided into eight coarse partitions. Similarly, the saturation and value components are each divided into three partitions. Consequently, the global colour histogram can be calculated as follows [18]:

$$\begin{aligned} H= \left\{ \begin{array}{lll} 0, &{}\;\;\;&{} \text{ if } \; h\in [316,360] \\ 1, &{}\;\;\;&{} \text{ if } \; h\in [1,45]\\ 2, &{}\;\;\;&{} \text{ if } \; h\in [46,90]\\ 3, &{}\;\;\;&{} \text{ if } \; h\in [91,135]\\ 4, &{}\;\;\;&{} \text{ if } \; h\in [136,180]\\ 5, &{}\;\;\;&{} \text{ if } \; h\in [181,225]\\ 6, &{}\;\;\;&{} \text{ if } \; h\in [226,270]\\ 7, &{}\;\;\;&{} \text{ if } \; h\in [271,315]\\ \end{array} \right. \end{aligned}$$
(7)
$$\begin{aligned} S= \left\{ \begin{array}{lll} 0, &{}&{} \text{ if } \; s\in [0,0.3) \\ 1, &{}\;\;\;&{} \text{ if } \; s\in [0.3,0.7]\\ 2, &{}&{} \text{ if } \; s\in (0.7,1]\\ \end{array} \right. \end{aligned}$$
(8)
$$\begin{aligned} V= \left\{ \begin{array}{lll} 0, &{}&{} \text{ if } \; v\in [0,0.3) \\ 1, &{}\;\;\;&{} \text{ if } v\in [0.3,0.7]\\ 2, &{}&{} \text{ if } v\in (0.7,1]\\ \end{array} \right. \end{aligned}$$
(9)

The colours are quantised into a small number of bins in order to decrease the number of colours used in feature matching. The proposed scheme produces only 14 colour bins (8 for hue plus 3 each for saturation and value).
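
The quantisation in Eqs. 7 to 9 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the 14-bin histogram is assumed to be the concatenation of the 8 hue bins, 3 saturation bins and 3 value bins, the counts are normalised by the number of pixels, and hue values that Eq. 7 does not list explicitly (h = 0 and non-integer values such as 45.5) are assigned by rounding up to the next 45° boundary.

```python
import math

def quantise_hue(h):
    """Map a hue in [0, 360] degrees to one of the 8 bins of Eq. 7."""
    # ceil(h / 45) gives 1..8 for h in (0, 360]; bin 8 wraps to bin 0.
    # Assumption: h = 0 is placed in bin 0 with the [316, 360] partition.
    return math.ceil(h / 45.0) % 8

def quantise_unit(x):
    """Map a saturation or value in [0, 1] to one of 3 bins (Eqs. 8 and 9)."""
    if x < 0.3:
        return 0
    if x <= 0.7:
        return 1
    return 2

def hsv_histogram(pixels):
    """Build a normalised 14-bin histogram from (h, s, v) tuples."""
    hist = [0.0] * 14
    for h, s, v in pixels:
        hist[quantise_hue(h)] += 1         # bins 0..7: hue
        hist[8 + quantise_unit(s)] += 1    # bins 8..10: saturation
        hist[11 + quantise_unit(v)] += 1   # bins 11..13: value
    n = len(pixels)
    return [count / n for count in hist] if n else hist
```

Concatenating per-channel histograms keeps the descriptor at 14 values; a joint histogram over all three channels would instead need 8 × 3 × 3 = 72 bins.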

4.2 Local Feature Extraction

Local features of objects are widely used in image matching and classification. A local feature descriptor takes into account the regions or objects in an image to describe it. In this work, the original leaf images are first decomposed using the Haar wavelet before local features are extracted. The wavelet transform is one of the best tools for determining where the low-frequency and high-frequency content of an image lies. It is used in compression to decompose an image into approximation and detail sub-images. The approximation sub-image shows the general trend of pixel values, and the three detail sub-images show the vertical, horizontal and diagonal details or changes in the image [7].

In the wavelet transformation, low-pass filtering is conducted by averaging two adjacent pixel values, whereas high-pass filtering is conducted by taking the difference between two adjacent pixel values. As a result, the first-level Haar wavelet produces four sub-bands as its output: \(LL_1\), \(HL_1\), \(LH_1\) and \(HH_1\) [14]. The process can be repeated to compute a multiple-scale decomposition, as in the two-scale wavelet shown in Fig. 1. The LL sub-band contains a rough description of the original image and is hence called the approximation sub-band. The HH sub-band contains the high-frequency components along the diagonals. LH contains mostly the vertical detail information, and HL represents the horizontal detail information. The sub-bands HL, LH and HH are called the detail sub-bands since they add the high-frequency detail to the approximation image [26]. Wavelet decomposition therefore reduces the resolution of the sub-images, which helps reduce the computational complexity of the proposed system; an image with 64 \(\times \) 64 resolution has been shown to be sufficient to recognise a leaf image [22].

Fig. 1. Wavelet coefficient structure [22]
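
The averaging and differencing scheme described above can be sketched in plain Python as follows. This is a minimal illustration and not the implementation used in the experiments. The helper names are hypothetical, the image is assumed to be a list of rows with even width and height, the differences are halved (one of several possible normalisations), and the assignment of the labels HL and LH to the two mixed sub-bands follows one common convention that may differ from the one in Fig. 1.

```python
def _split_pairs(rows):
    """Average (low-pass) and half-difference (high-pass) of adjacent pairs."""
    low = [[(r[i] + r[i + 1]) / 2.0 for i in range(0, len(r) - 1, 2)] for r in rows]
    high = [[(r[i] - r[i + 1]) / 2.0 for i in range(0, len(r) - 1, 2)] for r in rows]
    return low, high

def _transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def haar_level(image):
    """One level of 2-D Haar decomposition; returns (LL, HL, LH, HH)."""
    # Filter along each row first, then along each column of both results.
    low, high = _split_pairs(image)
    ll, lh = (_transpose(band) for band in _split_pairs(_transpose(low)))
    hl, hh = (_transpose(band) for band in _split_pairs(_transpose(high)))
    return ll, hl, lh, hh

def ll2(image):
    """Approximation sub-band after two levels of decomposition."""
    return haar_level(haar_level(image)[0])[0]
```

Each level halves both dimensions, so two levels applied to a 256 × 256 image give a 64 × 64 approximation sub-band.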

Local features are then extracted from the decomposed leaf images. Many local feature descriptors have been proposed in the past. One of the most influential is the local binary pattern (LBP) [24], which is adopted in our work. The LBP is a simple yet efficient and powerful operator for describing the local image pattern (image texture). It has been used in many areas such as image retrieval, automatic face recognition and detection, and medical image analysis [12, 22]. The LBP value is obtained by thresholding the pixels of a circular neighbourhood against the central pixel; the resulting binary values are then multiplied by binary weights and summed to give the final code. The equations are as follows [30]:

$$\begin{aligned} LBP_{P,R}(x_c,y_c)=\sum _{p=0}^{P-1} s(g_p - g_c)2^p \end{aligned}$$
(10)
$$\begin{aligned} s(x)= \left\{ \begin{array}{ll} 1 &{} \text{ if } x\ge 0 \\ 0 &{} \text{ if } x< 0 \\ \end{array} \right. \end{aligned}$$
(11)

where \(x_c \) and \(y_c \) are the coordinates of the centre pixel, P is the number of sampling points (neighbourhood pixels) on a circle of radius R, \(g_p \) is the grey-scale value of the p-th sampling point, \(g_c\) is the grey-scale value of the centre pixel and s (sign) is the threshold function. Examples of the circular neighbourhood are illustrated in Fig. 2 [13, 30].

Fig. 2. Circular neighbourhood for LBP [30]
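
As a worked illustration of Eqs. 10 and 11, the sketch below computes the basic LBP code for P = 8 and R = 1 in plain Python. It makes assumptions beyond the text: the function names are hypothetical, the circular neighbourhood is approximated by the eight pixels of the 3 × 3 window without interpolation, the neighbours are visited in an arbitrary but fixed order, and border pixels are skipped.

```python
# Assumed neighbour order (row offset, column offset) for P = 8, R = 1.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def lbp_code(img, r, c):
    """LBP code of the pixel at row r, column c (must not be on the border)."""
    gc = img[r][c]
    code = 0
    for p, (dr, dc) in enumerate(OFFSETS):
        # s(g_p - g_c) is 1 when the neighbour is at least as bright as the centre.
        if img[r + dr][c + dc] - gc >= 0:
            code += 2 ** p
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

In the proposed method this operator would be applied to the wavelet approximation sub-band instead of the full-resolution image, which reduces the number of pixels to process.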

4.3 Hybrid Descriptor

The framework of the proposed Hybrid Descriptor (HD) is illustrated in Fig. 3. The first step of the proposed method is to resize all images to 256 \(\times \) 256 pixels. Two different sets of feature vectors are then extracted from the resized images.

A colour histogram computed in the RGB colour space has a very large number of bins: an RGB histogram model with 256 bins per channel has around 16.7 million bins (256 \(\times \) 256 \(\times \) 256). We therefore used the HSV colour histogram, computed with Eqs. 1–9, to reduce the number of bins for each image and to extract the global features.

As mentioned above, we rely on LBP for local features as it is a powerful technique for describing leaf texture. However, processing all pixels in an image is time-consuming because the window size is fixed [29]. To overcome this problem, we exploit the Haar wavelet, specifically the \(LL_2\) sub-band. The \(LL_2\) sub-band holds the approximation coefficients of the wavelet decomposition; it contains most of the energy and represents the low-frequency information of a leaf image. Some HSV colour histogram features are weak because of a lack of chromaticity, whereas LBP features capture local information only. The LBP features are therefore combined with the HSV colour histogram features using a digital OR gate truth table, so that each descriptor compensates for what the other lacks and a robust feature vector is obtained. The rule that results in the minimum distance error then becomes the classifier that assigns the leaf image to the corresponding plant species.

Table 1 presents the OR gate truth table, and Fig. 4 depicts the minimum distance classifier.

Fig. 3. The framework of the HD method

Table 1. The OR gate truth table
Fig. 4. The minimum distance classifier
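
The text does not spell out exactly how the OR rule and the minimum distance rule interact, so the following Python sketch shows only one possible reading and should not be taken as the authors' implementation. In this reading, each descriptor nominates its nearest class, either nomination is acceptable (the OR), and the nomination with the smaller distance wins. All function names, the use of class mean vectors as prototypes, and the choice of the L1 distance are assumptions made for illustration; the distances of the two descriptors are only comparable if both feature vectors are normalised in the same way.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def class_means(samples):
    """samples: dict mapping a label to a list of feature vectors."""
    means = {}
    for label, vectors in samples.items():
        n = len(vectors)
        means[label] = [sum(column) / n for column in zip(*vectors)]
    return means

def nearest_class(feature, means):
    """Minimum distance rule: return (label, distance) of the closest mean."""
    pairs = ((label, l1_distance(feature, mean)) for label, mean in means.items())
    return min(pairs, key=lambda pair: pair[1])

def hybrid_classify(local_feat, global_feat, local_means, global_means):
    """Hypothetical combination: keep the nomination with the smaller distance."""
    candidates = [
        nearest_class(local_feat, local_means),    # LBP-based nomination
        nearest_class(global_feat, global_means),  # colour-histogram nomination
    ]
    return min(candidates, key=lambda pair: pair[1])[0]
```

With this rule, a leaf whose colour histogram is uninformative can still be classified by its texture, and vice versa, which matches the motivation given for combining the two descriptors.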

5 Experimental Evaluation

5.1 Experimental Design

The experimental evaluation of the proposed leaf image classification method was designed following the general framework of pattern classification. The CLEF 2011 image dataset was used in the experiment. The data set consisted of four different plant species: Cornus_mas, Magnolia_denudata, Ulmus_glabra, and Ulmus_parvifolia. The data set was divided into two subsets, the training set and the testing set. The training set contained 60 leaf images, in which each plant species had 15 leaf images with various sizes, directions and surfaces. The testing set included 40 leaf images, ten for each species. Classifiers were learned from the training set by using the features extracted from the leaf images. When a query leaf image was given from the testing set, hybrid descriptors were extracted and compared against the classifiers to assign the leaf image to its species. Figure 5 shows some image samples from the training set and Fig. 6 depicts the step-by-step dataflow in the experiment.

Fig. 5. Sample leaf images in experimental dataset

Fig. 6. Experimental dataflow

Three typical leaf classification and recognition methods were selected as the baseline models in the experiments: LBP [24], WavLBP [7], and HSV-CH [23]. The LBP method is a simple and efficient way to describe texture in leaf images. The WavLBP method is efficient in reducing computational complexity in feature extraction. The colour histogram (CH) is commonly used for global feature extraction. HSV-CH applies CH in the HSV space and further reduces the number of colour bins and the processing complexity. The proposed HD method was compared with the three baseline models in the experimental evaluation.

The performance of experimental models was measured by Precision, Recall, F-measure and Accuracy, which are all commonly accepted in the related research community. They are defined as follows:

$$\begin{aligned} Precision= \frac{TP}{(TP+FP)}\times 100\,\% \end{aligned}$$
(12)
$$\begin{aligned} Recall= \frac{TP}{(TP+FN)}\times 100\,\% \end{aligned}$$
(13)
$$\begin{aligned} F\text {-}measure =2\cdot \frac{P\cdot R}{P+R}\times 100\,\% \end{aligned}$$
(14)
$$\begin{aligned} Accuracy =\frac{(TP+TN)}{\mathcal {D}}\times 100\,\% \end{aligned}$$
(15)

Here, P and R denote Precision and Recall expressed as fractions, and \(\mathcal {D}\) is the total number of classified samples. TN (True Negative) denotes the case of a negative sample being predicted negative (e.g., a non-Cornus_mas leaf image being correctly classified into the complement class of Cornus_mas); TP (True Positive) refers to the case of a positive sample being predicted positive (e.g., a Cornus_mas leaf image being correctly classified into the class of Cornus_mas); FN (False Negative) refers to the case of a positive sample being predicted negative (e.g., a Cornus_mas leaf image being incorrectly classified into the complement class of Cornus_mas); and FP (False Positive) denotes the case of a negative sample being predicted positive (e.g., a non-Cornus_mas leaf image being incorrectly classified into the class of Cornus_mas).
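
The four measures can be computed from the confusion counts of a single class as in the following Python sketch. The function name and the example counts are hypothetical and are not results from the experiments. Accuracy is computed in its conventional form, with the correctly classified samples (TP + TN) divided by the total number of samples.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, F-measure and accuracy as percentages."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": 100.0 * precision,
        "recall": 100.0 * recall,
        "f_measure": 100.0 * f_measure,
        "accuracy": 100.0 * accuracy,
    }

# Made-up counts for one species in a 40-image test set (10 positives, 30 negatives).
example = classification_metrics(tp=8, fp=2, fn=2, tn=28)
```

With these made-up counts, precision, recall and F-measure are each 80 % and accuracy is 90 %; the gap arises because accuracy also rewards the 28 correctly rejected images of the other three species.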

5.2 Experimental Result Analysis

In this section, we discuss the results of the experimental evaluation conducted on our proposed system. We first evaluated the performance of the system when it was implemented with each individual method, namely LBP, WavLBP and HSV-CH. After evaluating the effectiveness of the individual methods, we then investigated the performance of our proposed method by comparing it with these baseline models. The experimental results are presented in Figs. 7, 8, 9 and 10 and Table 2.

Table 2. Comparison of average results
Fig. 7. Detailed experimental results in four methods for Ulmus_parvifolia dataset

Fig. 8. Detailed experimental results in four methods for Cornus_mas dataset

Fig. 9. Detailed experimental results in four methods for Magnolia_denudata dataset

Fig. 10. Detailed experimental results in four methods for Ulmus_glabra dataset

Performance of the Proposed Leaf Recognition System. In this experiment, we evaluated our proposed system, which aims at improving the performance of leaf image recognition systems. From the results shown in Table 2, one can see a noticeable increase in performance in terms of precision and recall for our proposed method when compared to the baseline methods. The system with the proposed method achieved 93.33 % and 24.03 % for precision and recall, respectively. The proposed method also yields a higher accuracy value than the others. This is because the proposed method is based on two feature descriptors: local features relying on LBP extracted from the \(LL_2\) wavelet sub-band, and global features represented by a colour histogram in the HSV space, which is more natural to humans.

Expanding on the averaged results in Table 2, Figs. 7, 8, 9 and 10 illustrate the detailed results in Precision, Recall, F-measure and Accuracy for each method on the four different plant species. The results reveal that the proposed method is capable of recognising the leaf images of all four species with a high level of accuracy and precision.

Performance of the Leaf Recognition System with Different Methods. We first explored the performance of the experimental system employing the baseline methods LBP, WavLBP and HSV-CH. Table 2 shows their averaged experimental results. From the results, we can see that the LBP method yields high degrees of precision and recall. However, the highest precision and recall results were achieved when applying the wavelet or HSV colour histogram methods before using the LBP method. Such an observation reveals the difference between the baseline models in terms of their capacity for leaf image recognition and provides practical justification for the development of our proposed method.

6 Conclusions

Aiming at improving the performance of image recognition and classification, a hybrid descriptor method has been introduced in this paper to recognise and classify leaf images for plant species. The method adopts a hybrid descriptor combining both global and local image features extracted from leaf images. Experimental results show that the proposed system yields promising performance with a high level of accuracy and precision. The work contributes to knowledge advancement in leaf image classification and plant species recognition. In the future, the performance of our proposed system will be further improved by using additional combination methods. Further experimental evaluations will also be conducted using larger, more extensive datasets and comparing with more state-of-the-art related methods.