1 Introduction

Shape is an important feature for representing an object efficiently and thus effectively characterizes the contents of an image in digital form [24]. Unlike color and texture, the shape of an object is independent of the capturing sensor and is therefore widely used in many fields of computer vision. The challenge, however, is to represent shape information accurately even for an incomplete object contour. Such incomplete objects occur in the real world due to occlusion, noise, and clutter in natural images, and they degrade the performance of shape recognition and localization considerably.

Unlike animals, human beings are capable of responding rapidly to visual inputs, including incomplete shapes [28], and mimicking the human vision system (HVS) therefore remains a challenge. A system that identifies both complete and incomplete shapes is indispensable for any digital object analysis. For example, consider the scenario shown in Fig. 1, where an incomplete diseased leaf is captured at high and low resolution using two different devices: a scanner and a smartphone, respectively. In contrast to color and texture information [34], shape features can be extracted easily even in a low-vision environment.

Fig. 1
figure 1

A diseased leaf image captured using a standard scanner and a low vision smart phone, respectively

In agriculture and botanical research, surveying plants and discovering new species among millions of plant species are very challenging tasks, and incomplete samples add to the complexity. These tasks can be sped up and improved if botanists are assisted by a computerized vision system. Currently, however, such systems are neither mobile nor easily accessible. With the advent of smartphones, computer-aided plant species identification systems can be accessed even by non-professionals, such as farmers. Thus, in this paper, we propose an efficient plant biometric tool suitable for smartphones.

Various shape representation approaches for image/object retrieval, including 3-D shape-based retrieval [30], have been presented in the literature. In the late 1970s, Lester et al. [17] applied heuristic search with a least-maximum-cost criterion to find the boundaries of white blood cells. Common features such as the centroid, perimeter, hole count, area, eccentricity, and sets of moment invariants were used for shape analysis in [9] and [22], whereas in [12] different kinds of curvatures and transforms were used to represent objects under generic constraints such as isotropy, smoothness, and extensibility. The limitation of these approaches is that they suit simple, smooth curves but are not applicable to less informative, irregular, complex bodies. Venkatesh et al. [31] presented a symmetry-based shape completion approach for object retrieval, but it cannot always be applied, since not all objects are symmetric in nature. Recently, Kurtek et al. [14] used asymmetric properties for a shape descriptor. In [5], Chahooki and Charkari proposed a content-based image retrieval system that differences contour- and region-based shape information using Euclidean distance, achieving 92.24 % accuracy on the MPEG-7 shape database. Similarly, in [8], the authors used arithmetic edge coding for shape retrieval.

The complexity of shape descriptors increases with rotation, scaling, and translation (RST) invariant features. In [20], the authors compared various proven RST-invariant algorithms, namely the angular radial transform (ART), Zernike moments (ZM), curvature scale space (CSS), and Fourier descriptors (FD), for general shape representation. FD and CSS are contour-based descriptors, whereas the other two are region-based. Although image moment descriptors lack contour information, they perform better in terms of RST invariance than the other methods. CSS, proposed by Abbasi et al. [1], represents an image shape for similarity matching using arch-shaped contours with a dimensional complexity of 128. Beyond these methods, Cope et al. [6] provide a detailed survey of plant identification approaches and leaf representation methods. Almost all the algorithms discussed above are computationally costly and lack global information.

Laga et al. [15] presented the Square Root Velocity Function (SRVF) to represent the global shape information of a plant leaf using a Riemannian elastic metric. The average accuracy of SRVF is high, but so is its time complexity, which precludes a real-time mobile system. In 2014, Wang et al. [33] provided the Multiscale-ARCH-Height (MARCH) algorithm for plant leaf image retrieval on mobile devices, and in 2015 hierarchical multi-task structural learning was proposed for large-scale plant species recognition [10]. Several other methods, such as chain codes, the medial axis, the discrete wavelet transform (DWT), and deformable templates, have been discussed in the literature [34], but they fail to provide proper RST invariance.

The ubiquity of modern smartphones makes them a natural substitute for a field expert: even an ordinary farmer can obtain detailed information about a leaf image anywhere, anytime. This paper therefore contributes low-cost computational algorithms for plant species identification with high accuracy and minimal energy consumption on a mobile device. The key requirement for proper shape representation is the RST invariance property. The proposed Angle View Projection (AVP) transform therefore acquires a 1D shape profile curve from the 2D image, representing shape with RST invariance. It also provides a solution for shape occlusion caused by insects, disease, or noise. This species identification system is important because it narrows the disease search database and improves the recognition rate.

AVP transforms a 2D leaf into a unique 1D RST-invariant shape profile by tracing the continuous contour points of the leaf from four fixed directions, hence its name. To preserve shape information and cope with incomplete leaves, the complete profile is compacted via the invertible Discrete Cosine Transform (DCT). In the frequency domain, the DCT curve maintains the shape property of the leaf and captures its global features. To address the dimensionality of the DCT features, principal component analysis (PCA) is applied, followed by identification and recognition.

The main motivation for using the AVP shape profile curve is the illumination problem that occurs while capturing the leaf (object). Under low lighting, unlike color and texture information, the shape pattern remains the same and can therefore contribute to a low-vision identification system. The proposed method is implemented and evaluated on several plant leaf datasets: the Flavia leaf dataset [36], the 100 plant species leaves dataset [19], the Swedish leaf image database [29], the Intelligent Computing Laboratory leaf dataset [13], and a diseased leaf dataset [26].

The rest of the paper is organized as follows. Section II details the construction of the AVP shape profile curve and its characteristics. Section III covers AVP parameters and matching. Section IV presents the proposed mobile architecture. Results and comparisons with the state of the art are highlighted in Section V, and Section VI concludes the paper.

2 The angle view projection (AVP) representation

The shape of an object is the external boundary or outline used to describe the object mathematically. In other words, a shape is a set of points, \( \mathcal{S}=\left\{{\mathcal{S}}_1,{\mathcal{S}}_2,\dots {\mathcal{S}}_p\right\} \), such that these p points collectively form a curve α, where \( \alpha :\ {\mathbb{S}}^1\to {\mathrm{\mathbb{R}}}^d \). Here, d is the dimension, usually taken to be two for a 2D shape. Therefore, a 2D plant leaf can be modeled as α leaf , where α leaf (t) = {x(t), y(t)} and (x, y) are the 2D coordinates in shape space [11] at instance t. The foremost challenge in plant leaf identification is to segment the leaf from a complex background and then extract the boundary shape to represent it in \( {\mathbb{S}}^1 \). Tracing the leaf contour in one direction, either clockwise or anticlockwise, from a fixed point forms a unique shape curve α leaf [34]. Shape representation and metric analysis in such a space can quantify the difference between two shapes. In the 2D plane, a shape is invariant to the transformations of rotation, scaling, and translation. For example, if a shape α leaf in the non-linear shape space \( {\mathbb{S}}^1 \) is translated to α leaf ' in the same space, then α leaf  = α leaf ', because translation moves the object without affecting its shape. Mathematically, \( {\alpha}_{leaf}\overset{T}{\to }{\alpha}_{leaf\hbox{'}} \), i.e., α leaf ' = T[α leaf ] = α leaf , where T is a translation function. The same holds for rotation and scaling of an object in the plane ℝ2. That is, rotation ℛ and scaling S sc are shape-preserving, \( {\alpha_{leaf}}_{{}^{\prime \prime }}=\mathrm{\mathcal{R}}\left[{\alpha}_{leaf}\right]={\alpha}_{leaf} \) and α leaf ' ' ' = S sc [α leaf ] = α leaf , up to the samples per interval. If the dimension of the space changes, however, α changes.

2.1 AVP curve

The input leaf image, say I leaf , is captured using a mobile device at a resolution β in the color space ℂ3 (it may be a color, grayscale, or binary image). This raises many challenges for researchers in image processing (IP) and CV, such as segmentation. Thus, to analyze a plant leaf and overcome these challenges irrespective of lighting conditions, the 2D leaf is projected onto the proposed unique 1D AVP shape profile curve in \( {\mathbb{S}}^1 \).

Before applying the AVP shape profile curve transform, the input image is pre-processed: it is binarized and normalized to make it translation invariant. The color or grayscale image is transformed into a binary image by passing it through a 2-level threshold, λ = [λ min λ max ], as described in Eq. 1. When the leaf is kept isolated against a constant background (Fig. 1a), the threshold reduces to a single value (λ min  = λ max ) and acts similarly to Otsu's threshold [23].

$$ \left.{I}_{leaf}\overset{\lambda }{\to }{I}_{bw\_ leaf}\right|\kern0.5em {I}_{bw\_ leaf}\left(x,\ y\right)=\left\{\begin{array}{cc}1, & {\lambda}_{min}\le {I}_{leaf}\left(x,y\right)<{\lambda}_{max}\\ {}0, & \text{otherwise}\end{array}\right. $$
(1)

In Fig. 2, I leaf is the captured input leaf image (Fig. 2a), which may be color or gray, and I bw_leaf is its binarized image (Fig. 2b). Figure 2c shows the histogram of Fig. 2a with the minimum and maximum thresholds, transforming the bins from I leaf to I bw_leaf , i.e., from [0 255] to [0 1].
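The 2-level thresholding of Eq. 1 can be sketched as follows; this is a minimal numpy illustration, and the function name `binarize` is our own, not from the paper:

```python
import numpy as np

def binarize(img, lam_min, lam_max):
    """Eq. 1: pixels with lam_min <= I(x, y) < lam_max map to 1, all others to 0."""
    img = np.asarray(img)
    return ((img >= lam_min) & (img < lam_max)).astype(np.uint8)
```

When λ min = λ max the interval is empty, so in practice the single-threshold (Otsu-like) case would be handled by a simple `img >= lam` comparison instead.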

Fig. 2
figure 2

Colored leaf image and its binary image: (a) Normal leaf, I leaf [36], (b) binary image, I bw_leaf , and (c) histogram to decide λ

Further, I bw_leaf is cropped to a standard resolution β ' with dimension (x ' × y′) containing only the leaf, as shown in Fig. 3b. This reduces computation and protects the leaf shape information from unwanted noisy background. The cropping algorithm is defined by the outermost white pixels, i.e., where I bw_leaf (x, y) = 1 (Fig. 3). This is formulated in Eq. 2, where I crop_leaf is the cropped leaf image obtained from I bw_leaf such that (x, y) ≥ (x′, y′).

$$ {I}_{crop\_ leaf}={I}_{b{w}_{leaf}}\left(A:C,\ B:D\right)\kern0.5em \left|\kern0.5em A,C\ \text{and}\ B,D\ \text{are the outermost rows and columns containing}\ {I}_{b{w}_{leaf}}\left(x,y\right)=1\right. $$
(2)
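The bounding-box crop of Eq. 2 can be sketched as below; `crop_to_leaf` is a hypothetical helper name, and the row/column bounds A, C, B, D follow the equation's notation:

```python
import numpy as np

def crop_to_leaf(bw):
    """Eq. 2 sketch: crop a binary image to the tightest box containing all
    foreground (1) pixels. Rows A..C and columns B..D bound the outermost
    white pixels."""
    rows = np.flatnonzero(bw.any(axis=1))   # rows containing a white pixel
    cols = np.flatnonzero(bw.any(axis=0))   # columns containing a white pixel
    A, C = rows[0], rows[-1]
    B, D = cols[0], cols[-1]
    return bw[A:C + 1, B:D + 1]
```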
Fig. 3
figure 3

(a) I bw_leaf with boundary pixels, and (b) cropped leaf, I crop_leaf

Now the AVP shape profile curve transform is applied to I crop_leaf , as in Fig. 4. The shapelets projected in \( {\mathbb{S}}^1 \) from four different angle views, shown in Fig. 4b–e, together complete the shape profile. Thus, the AVP profile has length 2x′ + 2y′ = 2(x′ + y′) with amplitudes within the interval [0 max(x′, y′)], traced in clockwise or anticlockwise direction. In Fig. 4, the 2D I crop_leaf is transformed into a 1D AVP shape profile curve for x ' = y ' = 100; the x-axis and y-axis in Fig. 4b–e are the time instances and the amplitudes in the respective directions.

Fig. 4
figure 4

(a) I crop_leaf is sampled for top view projection, (b) top view projection curve, (ce) other three view projections

In Fig. 4a, the dotted lines show the sampling interval for the top view T projection (ϕ = 90°), shown in Fig. 4b. The sampling interval may be varied according to the computational capacity of the processing device, but increasing it compromises the fidelity of the shape profile. Similarly, the left view L (ϕ = 180°), bottom view B (ϕ = 270°), and right view R (ϕ = 0°) projections are shown in Fig. 4c–e, respectively. The complete AVP shape profile of I leaf is the set of these four angle views, concatenated as a continuous series {T, L, B, R}, as shown in Fig. 5. The combination must follow a fixed order, either clockwise or anticlockwise, to maintain consistency in the shape profile. The AVP profiles shown in Fig. 5b–c are in anticlockwise order: the top-left-bottom-right (TLBR) and (T’L’B’R’) view projections of Fig. 5a, respectively. The dotted lines in Fig. 5 show the different angle views, and the red and green stars are the maxima and minima on the AVP curve, respectively. These extrema may further be used to analyze the leaf shape signature in terms of max-min.
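One plausible reading of the four-view construction is sketched below: from each image border, record the distance to the first foreground pixel along every scan line, then concatenate the four shapelets in T-L-B-R order. The helper names and the treatment of empty scan lines (contributing 0) are our assumptions, not specified by the paper:

```python
import numpy as np

def avp_profile(bw):
    """Sketch of the AVP transform: four directional silhouette scans,
    concatenated as {T, L, B, R}. Output length is 2(h + w)."""
    h, w = bw.shape

    def first_hit(mask, axis):
        # distance to the first 1 along the given axis; 0 where the line is empty
        hit = mask.argmax(axis=axis)
        nonempty = mask.any(axis=axis)
        return np.where(nonempty, hit, 0)

    top    = first_hit(bw, axis=0)            # scan each column downward
    left   = first_hit(bw, axis=1)            # scan each row rightward
    bottom = first_hit(bw[::-1, :], axis=0)   # scan each column upward
    right  = first_hit(bw[:, ::-1], axis=1)   # scan each row leftward
    return np.concatenate([top, left, bottom, right])
```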

Fig. 5
figure 5

(a) Different angles of view projection, (b) AVP shape profile curve of I leaf at 0° orientation and (c) I leaf at 45° orientation. The sharp change in both the profiles are the joining points of two shapelets to form the complete shape profile

The angles are selected such that each view spans a range of [0° 180°] in a single view spectrum. Suppose, in Fig. 5a, R is the right view of I leaf at zero orientation; if I leaf is rotated by 45°, the right view changes to R’. The other views are computed similarly. To represent a complete AVP shape profile curve over [0° 360°], four such shapelets are needed, and therefore the views at (90°, 180°, 270°, 360°) are used in our experiments. In Section V, a detailed comparison between the 0° and 45° views is given.

The AVP curve is then used for classification or clustering of shapes. In general, shape-based image retrieval is expected to have RST-invariant properties. Since the leaf image is cropped to a standard format containing only the leaf, with no room in ℝ2 to translate, it is translation invariant. For scaling and rotation, however, the AVP curve needs further processing prior to classifier training and testing.

2.2 Scale normalization

As discussed above, the shape is the set of boundary points sampled at a regular interval that depends on the size of the object. A change of size changes the number of sample points and their amplitudes. Suppose a plant leaf I leaf with shape α leaf and size (x′ × y′) has sample points \( \mathcal{S}=\left\{{\mathcal{S}}_1,{\mathcal{S}}_2,\dots {\mathcal{S}}_p\right\} \). If the size is transformed to (x″ × y″) such that ((x′ × y′) > (x″ × y″)), as in down-sampling, the sample set S is reduced to \( \mathcal{S}\hbox{'} \), where \( \left.\mathcal{S}\hbox{'}=\left\{\mathcal{S}{\hbox{'}}_1,\mathcal{S}{\hbox{'}}_2,\dots \mathcal{S}{\hbox{'}}_{p^{\prime }}\right\}\right|{p}^{\prime }<p \). Therefore, if I leaf is scaled to I leaf ' by a factor S sc , i.e., I leaf ' = S sc [I leaf ], then α leaf  ≠ α leaf '. To make these two AVP shape curves identical, I crop_leaf is normalized to a standard size (u × v). Note that the smaller (u, v) is, the lower the computational cost, but at the expense of accuracy; accuracy and computational cost must therefore be traded off. Figure 6a shows the original leaf I leaf of size (1200 × 1600), and Fig. 6d its corresponding AVP. In Fig. 6b, the same leaf is scaled to half the original size, with its AVP in Fig. 6e. The two curves are similar except for the number of samples and the amplitude.
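Normalization to a standard (u × v) grid can be sketched with nearest-neighbour resampling; the paper does not specify the interpolation scheme, so this choice (and the name `normalize_size`) is an assumption:

```python
import numpy as np

def normalize_size(bw, u=100, v=100):
    """Resample a cropped binary leaf to a fixed (u x v) grid so that AVP
    curves of the same leaf at different scales coincide. Nearest-neighbour
    index mapping; assumes bw is a 2D array."""
    h, w = bw.shape
    rows = np.arange(u) * h // u   # source row for each target row
    cols = np.arange(v) * w // v   # source column for each target column
    return bw[np.ix_(rows, cols)]
```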

Fig. 6
figure 6

Down sampling of AVP curve: (a) I leaf , (b) \( \frac{I_{leaf}}{2} \), (c) I leaf rotated with θ°, (df) AVP of (ac) respectively

2.3 Rotation invariant

Since rotating the leaf (Fig. 6c) changes the AVP profile (Fig. 6f), AVP is not by itself a rotation-invariant transform. Therefore, I crop_leaf is first rotated to a fixed reference direction so that the AVP shape profile curve is the same under all conditions; this makes AVP rotation invariant, as shown in Fig. 7.

Fig. 7
figure 7

(a) Binary version of 6(c), (b) rotated by angle, θ°

To summarize, if I leaf is the input leaf image of resolution (x × y), then the AVP shape profile ξ can be defined using Eq. 3:

$$ \xi \left({I}_{leaf}\right)=AVP\left\{\mathrm{\mathcal{R}}\left({S}_{sc}\left(T\left({I}_{leaf},\ \left(x\hbox{'},\ y\hbox{'}\right)\right),\ \left({x}^{\prime },{y}^{\prime}\right)\right),\ \theta \right)\right\} $$
(3)

Experimental results (Section V) show that the proposed AVP algorithm has the following advantages: (i) it is accurate and robust, (ii) efficient and of low complexity, (iii) RST and noise invariant, and (iv) well suited to mobile devices.

3 AVP parameters and matching

Every I leaf is represented for matching using the reliable, robust, RST-invariant AVP shape profile. The important factor in leaf recognition is to represent the leaf uniquely in \( {\mathbb{S}}^1 \); the accuracy therefore depends directly on the uniqueness of the shape curve. In this section, we apply 1D DCT compaction to the 1D AVP shape curve and perform the matching operation. The section also discusses various parameters and the complexity of the AVP transform.

3.1 Extracting features from AVP

Feature extraction is a dimension-reduction process that transforms the leaf shape into a reduced set of features called a feature vector, which must carry correct information extracted from I crop_leaf . In this paper, the 1D AVP curve is the input to feature extraction and is transformed into a sum of cosine functions at different frequencies to overcome minor changes caused by pathogens. The DCT is closely related to the Discrete Fourier Transform (DFT) but uses only real numbers; it has eight variants, of which four are common in signal processing. It is widely used in communication and information technology to store and transmit 1D or 2D data such as speech and images because of its energy compaction property. The DCT is computationally faster than the DFT and DWT. It separates the signal into sub-bands of different frequencies in the frequency domain and concentrates the energy of the signal in its first few coefficients, with the energy decreasing steadily toward the last coefficient. In this paper, the DCT is formulated as in Eq. 4 and its inverse as in Eq. 5; this variant is known as DCT-II.

$$ \begin{array}{c}\hfill DC{T}_k=w(k){\displaystyle \sum_{i=0}^{N-1}}AV{P}_i \cos \left[\frac{\pi }{N}\left(i+\frac{1}{2}\right)k\right]\hfill \\ {}\hfill =w(k){\displaystyle \sum_{i=0}^{N-1}}AV{P}_i \cos \left[\frac{\pi k}{N}\left(\frac{2i+1}{2}\right)\right]\hfill \end{array} $$
(4)
$$ IDC{T}_i=w(k){\displaystyle \sum_{k=0}^{N-1}}DC{T}_k \cos \left[\frac{\pi k}{N}\left(\frac{2i+1}{2}\right)\right] $$
(5)
$$ \begin{array}{cc}\hfill \mathrm{where}\hfill & \hfill w(k)=\left\{\begin{array}{c}\hfill \sqrt{\frac{1}{N}},\kern1em k=1\hfill \\ {}\hfill \sqrt{\frac{2}{N}},\ 2\le k\le N\hfill \end{array}\right\}\hfill \end{array} $$
(6)

N is the length of the AVP curve, and i, k = {1, 2, … N}; note that the DCT and the AVP have the same length. Because the transform is linear and invertible, the leaf shape can be reconstructed with little error from the Gibbs phenomenon [32]. The DCT of the leaf in Fig. 5a, whose AVP curve is in Fig. 5d, is shown in Fig. 8. In Fig. 8a, the DCT spectrum spans 2(u + v), where u = v = 100. Figure 8b presents the energy contribution of the coefficients for the DCT, FFT, and DWT. In the DCT spectrum, the first coefficient carries the maximum energy, and the remaining coefficients decrease toward zero. To quantify this, the energy distribution of the DCT is calculated using Eq. 7: 96.97 % of the energy Ε DCT is contained in the first 10 coefficients, while the rest contribute only 3.03 %. Increasing to the first 20 coefficients yields 98.71 %, i.e., an energy gain of only +1.74 % while the dimension grows by 10, a drastic increase in computational complexity, as tabulated in Table 1. Therefore, the DCT spectrum can be reduced from 2(u + v) to its first few coefficients without losing any major information, ultimately optimizing the computational cost.

$$ {\mathrm{E}}_{DCT}={\displaystyle \sum_{i=1}^{N}}DC{T_i}^2 $$
(7)
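The orthonormal DCT-II of Eq. 4 and the energy measure of Eq. 7 can be sketched directly from their definitions; the helper names are ours, and a production system would use a fast O(N log N) DCT rather than this O(N²) matrix form:

```python
import numpy as np

def dct2_ortho(x):
    """Orthonormal DCT-II (Eq. 4), computed naively from the definition."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    i = np.arange(N)
    # C[k, i] = cos(pi * k * (2i + 1) / (2N))
    C = np.cos(np.pi * np.outer(np.arange(N), 2 * i + 1) / (2 * N))
    w = np.full(N, np.sqrt(2.0 / N))
    w[0] = np.sqrt(1.0 / N)          # w(k) of Eq. 6, 1-based k = 1
    return w * (C @ x)

def energy_fraction(coeffs, k):
    """Fraction of total DCT energy (Eq. 7) captured by the first k coefficients."""
    e = coeffs ** 2
    return e[:k].sum() / e.sum()
```

Because the transform is orthonormal, the total energy of the coefficients equals that of the input (Parseval), which is what makes the truncation argument in the text meaningful.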
Fig. 8
figure 8

(a) Complete DCT spectrum, and (b) Energy distribution of DCT, DWT and FFT coefficients

Table 1 DCT energy distribution and feature length optimization

In Table 1, an optimization ratio between the energy Ε DCT and the feature-space dimension #(DCT k ) is computed. The ratio shows that increasing the dimension of the AVP-DCT increases the complexity while the energy gained decreases drastically. The optimal tradeoff is #(DCT k ) = 3 with Ε DCT  = 90.06 %, but even with #(DCT k ) = 1, Ε DCT exceeds three quarters of the total DCT energy. Thus, the first κ coefficients are sufficient to represent I leaf properly. This κ-coefficient vector is then passed through PCA to optimize it further before classification. PCA, an unsupervised dimension-reduction transform, maximizes the variance of the projected input vector DCT κ, forming PCA κ in the PCA space and reducing the feature space further for classification.
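The PCA stage can be sketched with an SVD-based projection; `pca_reduce` is a hypothetical helper, and the paper does not specify how many components are retained:

```python
import numpy as np

def pca_reduce(X, d):
    """Project feature vectors (rows of X, e.g. kappa DCT coefficients per leaf)
    onto their top-d principal components. Minimal SVD-based PCA sketch."""
    Xc = X - X.mean(axis=0)                       # centre the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T                          # scores in the PCA space
```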

3.2 AVP matching

As proposed, every 2D leaf shape in our database is represented by PCA κ, the PCA-reduced 1D AVP-DCT shape profile curve. It retains the local and global shape properties of the input leaf for accurate recognition.

Most research on shape similarity retrieval performs shape-based image retrieval by predicting the first few objects most similar to the query image [1, 9, 12, 19, 20, 33, 34]. In this paper, the well-known, simple, non-parametric k-Nearest Neighbor (k-NN) classifier is used to recognize the plant species: the leaf is assigned the class of the majority among its k nearest neighbors. The k-NN algorithm is simple: (i) calculate the distance from the query sample to each labeled instance in the database, (ii) order the distances in increasing sequence, and (iii) take a majority vote over the first k. The complexity of the k-NN classifier therefore depends directly on the feature length and the size of the training dataset: if Ν train is the size of the training dataset and Ν feature is the feature vector length (Ν feature  = {1, … κ}), the computational complexity of k-NN is O(kΝ train Ν feature ).
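The three steps above can be sketched as follows; `knn_predict` is our own helper name, using Euclidean distance (the paper does not name its distance metric explicitly at this point):

```python
import numpy as np

def knn_predict(train_X, train_y, query, k=1):
    """k-NN by the three steps in the text: (i) distances to all labelled
    instances, (ii) sort them, (iii) majority vote over the first k."""
    d = np.linalg.norm(train_X - query, axis=1)        # step (i)
    nearest = np.argsort(d)[:k]                        # step (ii)
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[counts.argmax()]                     # step (iii)
```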

3.3 Computational complexity

In shape recognition, the computational cost is the sum of the cost of feature extraction and the time needed for classification. Extracting the AVP shape profile of a (x × y) image with 2(u + v) contour points requires O(uv) computation. From Eq. 4, uv = N, so O(uv) = O(N). Feature extraction from the AVP curve using the DCT requires O(N log2 N). Thus, the total cost of forming the AVP shape profile and computing the DCT coefficients is O(N) + O(N log2 N) = O(N log2 N). On the classification side, the time complexity of k-NN is O(kM), where k is the number of nearest neighbors and M = (Ν train Ν feature ) is the size of the data used for classification; if k = 1, this reduces to O(Μ). Therefore, the total time complexity of the proposed system is O(N log2 N) + O(Μ), and reducing Μ by 25 % reduces the complexity proportionally, making the system suitable for mobile device storage and computation.

3.4 Properties of AVP representation

The proposed AVP shape profile can address many shape-mining problems, such as visualization, representation, and retrieval of anomalies; for scope, however, this paper focuses on leaf identification under different rotation and scaling conditions. The proposed AVP method has the following key properties that make it well suited to a real-time mobile leaf image identification system.

3.4.1 Global information

Global information of a leaf is the information that separates two leaves from each other. For example, two shapes may share the same local features even though they are globally different, such as a circle and an ellipse, and so may be misclassified. AVP therefore captures, via DCT κ, all the global information present in the complete leaf shape profile obtained from the four views, as depicted in Fig. 6.

3.4.2 Shallow concavities

Shallow concavities are handled easily by AVP, as in Fig. 9a, which is difficult for CCD [34] if the center point is at ‘A’. Unlike CCD or other existing algorithms, AVP does not overlook concavities in the leaf: each is captured in at least one of the shapelets. However, the AVP shape profile may miss a deep concavity, marked by the dotted ellipse in Fig. 9b; handling such cases is left as future work.

Fig. 9
figure 9

(a) Shallow and (b) deep concavities in leaf images

3.4.3 RST invariance

As discussed above, rotating an object causes a circular shift of the AVP shape profile, which is detected and corrected by rotating to a standard orientation (Fig. 7) before AVP mapping. Similarly, the size of an object changes only the amplitude of the AVP curve, so the object is scaled to a standard size (u × v). Lastly, I leaf is transformed to I crop_leaf , shrinking the window to the leaf itself, which makes the representation translation invariant.
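One standard way to fix a reference orientation before AVP mapping is the principal-axis angle from second-order central moments; the paper does not state which orientation estimator it uses, so this is an illustrative assumption:

```python
import numpy as np

def orientation_angle(bw):
    """Dominant orientation (degrees) of the foreground pixels of a binary
    image, from second-order central moments. Rotating the leaf by -angle
    would align it to a standard direction before computing the AVP curve."""
    ys, xs = np.nonzero(bw)
    x = xs - xs.mean()
    y = ys - ys.mean()
    mu20 = (x * x).mean()
    mu02 = (y * y).mean()
    mu11 = (x * y).mean()
    return 0.5 * np.degrees(np.arctan2(2 * mu11, mu20 - mu02))
```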

3.4.4 Partial occlusion

A minor occlusion caused by disease or insect attack does not change the AVP curve, as in Fig. 10a–c. However, if a major portion of the leaf is destroyed, the AVP changes and behaves as a new shape, as shown in Fig. 10d. AVP comprises four shapelets from different view angles, so if any single view is damaged, the DCT compensates for it. Thus, damage of 10–20 % does not disturb the AVP-DCT feature and yields approximately the same result; details are discussed in Section V.

Fig. 10
figure 10

Leaf shape damaged due to disease attack [26]: (a) left view damaged, (b) top view damaged, (c) left and right view damaged and (d) incomplete leaf dataset (up to 40 % distorted) [19]

3.4.5 Low vision

While the texture and color information of a leaf depend directly on the lighting conditions at capture, the shape information is tolerant to lighting. The AVP shape profile maps the 1D curve from the outer silhouette of the leaf, which is visible even in low illumination. Thus, the proposed method is robust to low-vision mobile images.

Consider agro-bots moving through a farm field to monitor crops and the pathogens attacking them, capturing images in bad lighting with incomplete shapes and no proper texture visible. In that case, existing algorithms fail, but AVP uses the shape information to identify the species in real time, in a real scenario, without any laboratory setup.

4 Proposed mobile architecture

In the current technological world, mobile devices are capable of computing CV tasks such as automated leaf identification [15, 33]. Android is commonplace and well suited to a mobile field guide used by farmers and the general public alike. Mobile devices are constrained in RAM, processing power, and bandwidth, and depend on battery power, which limits the algorithms they can run; the architecture is designed with these constraints in mind.

The proposed mobile architecture for the automated leaf identification system using the AVP shape profile is shown in Fig. 11. The architecture is divided into three modules: pre-processing, feature extraction, and matching. Each module is efficient and fast enough to execute directly on the mobile device, with server-side computing support if needed. A system Agent monitors the health of the device, i.e., battery, computing power, RAM, and bandwidth, and offloads processes to the server over whichever wireless channel (2G, 3G, LTE, or Wi-Fi) is fastest and available at the time. Both online and offline interfaces are designed in our system. If a computation requires more power, the module is transmitted to the server via Wi-Fi (in our case) and the result is returned to the screen over the same TCP connection. The 2D textual information is augmented over the captured leaf in real time; this technology is called mobile see-through.

Fig. 11
figure 11

Mobile architecture for the proposed plant species identification system

As in Fig. 11, pre-processing is always performed on the mobile device, and the rest is decided by the on-device Agent monitoring the device health. Three scenarios are considered in this architecture: (1) execute everything on the mobile device, (2) perform pre-processing on the mobile device and offload the rest to a remote server, and (3) extract features on the mobile device and transmit only the feature vector to the server for classification, as labeled in Fig. 11.

5 Results and comparison with other methods

In this section, the proposed AVP-based automated low-vision plant biometric system for mobile devices is evaluated and compared with existing methods on different datasets. The system is evaluated using Matlab R2013b and then ported to the Android platform for testing on real devices.

5.1 Dataset and performance evaluation

As discussed earlier, the numerical details of the five datasets used for the experiments and validation of the proposed system are shown in Table 2 below. All datasets are publicly available except the last, the diseased leaf dataset, which was captured using mobile devices of different resolutions. Sample images from all the datasets, one per species, are shown in Fig. 12.

Table 2 Datasets used
Fig. 12
figure 12

Datasets: (a) Flavia Leaf Dataset [36], (b) 100 Plant Leaf Dataset [19], (c) Swedish Leaf Database [29], (d) ICL Leaf Dataset [13], and (e) Diseased Leaf Dataset [26]

For performance analysis, n-fold cross validation is used: each dataset is divided into n folds, and in turn a single fold is used for testing while the rest are used for training. Here, n is taken as ten. This is applied to all the datasets, and the accuracy is calculated from the confusion matrix. In this section, each dataset is compared against other existing methods and their performance is discussed.
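The n-fold protocol can be sketched as below; `n_fold_accuracy` and the pluggable single-sample `classify` callback are our own names for illustration:

```python
import numpy as np

def n_fold_accuracy(X, y, classify, n=10):
    """n-fold cross validation as described in the text: split into n folds,
    hold one out for testing, train on the rest, and average the per-fold
    accuracy. classify(train_X, train_y, query) -> predicted label."""
    idx = np.arange(len(X))
    folds = np.array_split(idx, n)
    accs = []
    for fold in folds:
        train = np.setdiff1d(idx, fold)
        hits = sum(classify(X[train], y[train], X[i]) == y[i] for i in fold)
        accs.append(hits / len(fold))
    return float(np.mean(accs))
```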

  1. Flavia Leaf Dataset

    The first dataset used in this paper is the Flavia leaf dataset [36], which is very popular and has been used by many researchers to design plant identification methods. Table 3 compares the accuracy of multi-scale convexity concavity representation (MCC) [2], triangle-area representation (TAR) [3], inner distance shape context (IDSC) [18], triangle side lengths and angle representation (TSLA) [21], and Multiscale-ARCH-Height (MARCH) [33] with the proposed AVP method on the Flavia leaf dataset. Figure 12a shows samples of the Flavia dataset with its 32 species.

    Table 3 Accuracy rates for Flavia leaf dataset [36]

    As seen in Table 3, the accuracy of the proposed approach is better than that of all the other methods; MARCH [33], previously the most accurate, is surpassed by the proposed DCT of the AVP shape profile curve. The experiment was also carried out with different classifiers at different feature lengths (Fig. 13): Naïve Bayes (NB), Sequential Minimal Optimization (SMO), k-NN, a Meta-classifier, and Random Forest (RF), all of different natures. In Fig. 13, both k-NN and RF perform well, but the computational complexity of RF is somewhat higher, which is why k-NN was selected for the mobile device. As the feature length increases, the error rate of these two classifiers changes only marginally, whereas that of the other classifiers changes drastically. The accuracy, measured using 10-fold cross validation, is 97.95 %, as shown in the zoomed window in Fig. 13.

    Fig. 13

    Accuracy comparison for Flavia leaf dataset with different classifiers (k-NN best output)

    The detailed experiment on the Flavia leaf dataset with varying feature vector length is graphed in Fig. 14. As shown there, k-NN classification with k = 1 and vector length #1 gives the best result in terms of both accuracy and complexity.

    Fig. 14

    k-NN classifier at: (a–d) 15, 25, 50 and 80 % of Flavia dataset, respectively

    From Fig. 14, we can see that even 15 % of the Flavia dataset produces 90.113 % accuracy with k = 1, whereas 25 % of the dataset yields 95.2 %, 50 % yields 96.875 %, and 80 % yields 98.8024 %. Even with 25 % of the dataset, the results are better than those of the other existing methods (Table 3), while reducing the classification complexity.

    In further analysis, the AVP curve was evaluated at different standard sizes (w × h) and was also tested at a 45° view projection (Fig. 15), as discussed in Section 2. In Fig. 15a, the size was set to (64 × 64), (100 × 100), and (110 × 110) to study the effect of image size, and (64 × 64) was found to outperform the others. The accuracy difference between the 0° and 45° views is marginal, and thus any angle of view projection can be used for leaf identification.

    Fig. 15

    (a) Angle view and (b) size (w × h) comparison at 10-fold

    An experiment was also performed to validate the role of PCA on AVP-DCT compactness. PCA improves the decision-making process and reduces the feature-space dimension. As shown in Fig. 16, PCA on AVP-DCT performs best, followed by AVP-DWT with PCA, AVP-DCT without PCA, AVP-DWT without PCA, and lastly the raw AVP shape profile curve. The AVP shape profile by itself gives an accuracy of 81.8727 % with vector length #400 for (100 × 100) and 60.1441 % with #40.

    Fig. 16

    Error rate of AVP shape profile curve with and without PCA
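The AVP-DCT plus PCA pipeline discussed above can be sketched as follows. The profile curves here are synthetic stand-ins, and the coefficient counts (#40 DCT coefficients, #5 principal components) follow the orders of magnitude reported in the text, not exact settings from the paper:

```python
import numpy as np
from scipy.fft import dct
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# Hypothetical AVP shape-profile curves: 100 leaves, 400-point profiles
profiles = rng.normal(size=(100, 400)).cumsum(axis=1)

# Step 1: an orthonormal DCT compacts each profile's energy into its
# leading coefficients; keep only the first 40
coeffs = dct(profiles, norm='ortho', axis=1)[:, :40]

# Step 2: PCA further reduces the 40-D DCT space to a few components,
# shrinking the feature vector handed to the k-NN classifier
pca = PCA(n_components=5)
features = pca.fit_transform(coeffs)
print(features.shape)  # (100, 5)
```

The two stages are complementary: DCT truncation discards high-frequency contour detail, while PCA removes the remaining correlated directions across the training set.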

  2.

    100 Plant Leaf Dataset

    The 100 plant leaf dataset is the second largest dataset used in this paper, with 16 samples for each of 100 different species. It is a binary leaf-shape dataset collected at the Royal Botanic Gardens, Kew, UK, as shown in Fig. 12b. Adopting the same n-fold cross validation protocol, the proposed method is compared with the Fourier descriptor, shape-texture [4], and hierarchical string cuts (HSC) [35] algorithms, as given in Table 4.

    Table 4 Accuracy rates for 100 plant leaf dataset [19]

    The shape-texture method [4] extracts features from a contour signature of the leaf and classifies them using the Jeffrey-divergence measure, resulting in 81.1 % accuracy. Unlike [4], the AVP shape profile curve is reduced to its first few PCA coefficients, which lowers the classification complexity. Our approach achieves an accuracy of 95.19 % for the 100 black-and-white species with 10-fold cross validation.

  3.

    Swedish Leaf Dataset

    The well-known Swedish leaf dataset is also used to test AVP performance, compared against IDSC, MCC, TAR, symbolic representation, the Fourier descriptor, TSLA, and MARCH [33]. The proposed accuracy rate, as seen in Table 5, is comparatively very high with a feature vector length of #5. The closest competitor, MARCH, reaches 96.21 % accuracy with #98, a high-dimensional feature space that may not be efficient for smartphones.

    Table 5 Accuracy rates for Swedish leaf dataset [29]

    The Swedish leaf dataset, shown in Fig. 12c, was collected in a project between Linkoping University and the Swedish Museum of Natural History [29, 33]. The accuracy summarized in Table 5 uses the gray-image dataset for all 15 species.

  4.

    ICL Leaf Dataset

    The ICL dataset, shown in Fig. 12d, is the largest dataset used in this paper, with 220 different species and 26 samples per class. For evaluating the AVP representation, the accuracy measure is used; AVP achieves 96.50 % accuracy with #3 vectors. Table 6 summarizes the performance of the existing methods with their respective feature-space lengths.

    Table 6 Accuracy rates for ICL leaf dataset [19]

    With three feature dimensions, the AVP representation separates 220 species with a high accuracy that was not achieved even by the #12288-dimensional IDSC space [18]. Thus, AVP represents plant leaves better than the other methods.

  5.

    Diseased Leaf Dataset

    Finally, the last dataset is a diseased leaf dataset [26] collected from the Forest Research Institute (FRI), Dehradun, India for five diseased plant species. The samples were captured with different mobile devices at different resolutions (2 MP, 3.2 MP, 5 MP, 8 MP), resulting in a total of 297 samples, as shown in Fig. 12e.

    Table 7 summarizes the accuracy comparison on the diseased leaf dataset for the DCT, DWT, and FFT of the AVP curve. The important point about this dataset is that the leaves are occluded by pathogen attacks, so it is more challenging than the previous datasets used in this paper. It also plays an important role in designing and validating low-vision leaf detection and identification based on partial, incomplete shape information. As given in Table 7, the proposed approach achieves an accuracy of 90 % for wrinkled-edge and diseased leaf images, which can be improved further by enlarging the dataset.

    Table 7 Accuracy rates for diseased leaf dataset [26]

5.2 Partial occlusion

In this paper, the diseased leaf dataset is partially occluded by disease attacks such as wrinkling of the edges, which also changes the texture of the leaves [27]. Thus, no texture- or color-based approach is suitable. Since no partial leaf dataset is publicly available, the diseased leaf dataset [26] is artificially tampered with to achieve the objective. The dataset includes occlusion from 5 to 20 %, as in Fig. 10a–c, and results in 90 % accuracy. This high accuracy is achieved because the AVP shape profile curve includes all the shapelets, so distorting a single shapelet has little effect. Misidentification arises only when more than one shapelet is damaged, as in Fig. 10c. The lost information is partly approximated by DCT compaction, which aids correct classification: DCT compacts the information into coefficients that lie in the range of the original leaf curve. The method was also tested on a sub-dataset of the 100 Plant Leaf dataset [19] with 30 images in total, 6 from each of five species, as shown in Fig. 10d. The accuracy achieved there is 72 %, with 40 % of the leaves distorted.
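The robustness argument, that truncating to low-order DCT coefficients smooths over a locally damaged stretch of the shape profile, can be illustrated on a synthetic curve; the profile and the occluded segment below are illustrative, not taken from the dataset:

```python
import numpy as np
from scipy.fft import dct, idct

t = np.linspace(0, 2 * np.pi, 400)
profile = np.sin(t) + 0.5 * np.sin(3 * t)    # stand-in intact leaf profile
occluded = profile.copy()
occluded[100:140] = 0.0                      # ~10 % of the contour lost

def low_order(curve, k=10):
    # Keep only the first k DCT coefficients and reconstruct
    c = dct(curve, norm='ortho')
    c[k:] = 0.0
    return idct(c, norm='ortho')

# The low-order reconstructions of the intact and occluded profiles stay
# close, so a nearest-neighbour match on truncated DCT features can
# survive moderate, localized occlusion
err = np.abs(low_order(profile) - low_order(occluded)).mean()
print(f"mean difference of low-order reconstructions: {err:.3f}")
```

Because the occlusion is localized, most of its energy sits in high-frequency coefficients, which the truncation discards; a global distortion (several damaged shapelets) would not be suppressed this way, matching the failure mode described above.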

Beyond these experiments, performance was also measured by calculating accuracy over varying feature dimensions. Figure 17 shows an accuracy-comparison bar graph for the datasets used in this paper as the feature vector length changes. It shows that even in a one-dimensional AVP-DCT feature space the error rate is low. The maximum accuracy achieved by the proposed representation is 100 % on the Swedish dataset using a simple 1-NN classifier. The bar graph also shows that the proposed representation performs best with the k-NN classifier. After species classification, the leaf can be further examined for disease diagnosis by computing the gray-level co-occurrence matrix of the Gabor wavelet transform, which reduces the misclassification of pathogen symptoms [27].

Fig. 17

Accuracy comparison of 1-NN with varying feature space for different datasets
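The disease-diagnosis step mentioned above, a gray-level co-occurrence matrix (GLCM) computed over a Gabor-filtered patch [27], can be sketched as below. The Gabor kernel parameters, the 8-level quantisation, and the random stand-in patch are all illustrative assumptions, not settings from [27]:

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.default_rng(3)
patch = rng.random((64, 64)) * 255          # stand-in gray-level leaf patch

# One real-valued Gabor kernel; frequency and sigma are illustrative
y, x = np.mgrid[-7:8, -7:8]
gabor = np.exp(-(x**2 + y**2) / (2 * 4.0**2)) * np.cos(2 * np.pi * 0.15 * x)
filtered = convolve(patch, gabor)

# Quantise the Gabor response to 8 levels, then build a horizontal GLCM
levels = 8
edges = np.quantile(filtered, np.linspace(0, 1, levels + 1)[1:-1])
q = np.digitize(filtered, edges)            # values in 0..levels-1

a, b = q[:, :-1].ravel(), q[:, 1:].ravel()  # horizontally adjacent pairs
glcm = np.zeros((levels, levels))
np.add.at(glcm, (a, b), 1)                  # accumulate co-occurrences
glcm /= glcm.sum()                          # normalise to a joint histogram

i, j = np.indices((levels, levels))
contrast = float((glcm * (i - j) ** 2).sum())  # classic GLCM texture cue
print(f"GLCM contrast of Gabor response: {contrast:.3f}")
```

Statistics such as contrast, homogeneity, and energy from this matrix then serve as texture descriptors for the pathogen-symptom classifier.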

5.3 Testing on mobile platform

The proposed AVP-DCT algorithm is implemented and deployed on Android, a popular mobile operating system, using the Android SDK in an Eclipse environment on Windows. AVP-DCT is well optimized for mobile devices; for testing, a quad-core 1.2 GHz Cortex-A7 Micromax A116 Canvas HD with 1 GB RAM and a 1 GHz Micromax A57 Ninja 3 with 512 MB RAM were used. The proposed algorithm is allotted 50 MB of RAM, which was never exhausted, and at most 35 MB of phone storage is used for holding 25 % of the datasets. The mobile Android app, called 'AgroMobile', is part of our ongoing project [25].

When a user captures a plant leaf image, the Agent measures the network bandwidth and the capacity of the mobile processor and decides accordingly whether to compute on the mobile or to transmit to the server over the available wireless channels [26]. The Agent estimates the time and power consumed in processing the query leaf image, covering both feature extraction and classification, using the online and offline databases. The time consumed in offline processing of a query image depends on many factors, such as the image size, the type of decoding, and the channel bandwidth. If the image is downsampled, the computing time and the bandwidth required to transmit it are reduced, which directly lowers the energy consumption of the mobile but degrades quality [26]. Thus, pre-processing improves the efficiency of the proposed system, both online and offline. The time complexity of the proposed system on mobile and server is shown in Table 8 for the ICL dataset (the largest dataset used). Compared to MARCH [33], our system responds in less time and is more accurate. Here, the time complexity is computed for (64 × 64) with k = 1, M = 25, and IEEE 802.11b/g/n.

Table 8 Time complexity of proposed system (ICL dataset) for (64 × 64)
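The Agent's offload decision can be sketched as a simple cost comparison; the linear time models, round-trip time, and parameter values below are hypothetical, not the estimators of [26]:

```python
# Illustrative offload decision under simple linear cost models;
# all estimator parameters here are assumptions, not from the paper.
def should_offload(image_bytes: int, bandwidth_bps: float,
                   local_ms_per_kb: float, server_ms_per_kb: float,
                   rtt_ms: float = 50.0) -> bool:
    kb = image_bytes / 1024
    local_time = kb * local_ms_per_kb                       # on-device cost
    transmit_time = image_bytes * 8 / bandwidth_bps * 1000  # upload, in ms
    remote_time = rtt_ms + transmit_time + kb * server_ms_per_kb
    return remote_time < local_time                         # offload only if faster

# A downsampled 64 x 64 query over Wi-Fi vs a fast handset CPU:
# transmitting costs more than local inference, so compute on the phone
print(should_offload(64 * 64, 5e6, 2.0, 0.1))
```

Downsampling the query image shrinks both `transmit_time` and `local_time`, which is why the pre-processing step benefits the online and offline paths alike.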

A comparison is also made between the computational complexity of AVP and that of the existing algorithms used for plant leaf identification; Table 9 summarizes this comparison against AVP-DCT.

Table 9 Comparison of computational complexity of different methods

A snapshot of the proposed system running on Android OS is shown in Fig. 18. The final decision, the plant species name and other details, is augmented over the mobile screen. For real field use, the proposed system is designed with minimal human-mobile interaction (HMI) - a mobile see-through system. In conclusion, the proposed system is an easy, simple, fast, and accurate mobile plant leaf biometric system for low vision. In future, given the large biodiversity involved, this work may be extended to a zero-data, zero-shot learning system as proposed in [16].

Fig. 18

Snapshot of the proposed low vision plant leaf biometric system (executed on BlueStacks)

6 Conclusion

In this paper, a novel shape profile curve transform named Angle View Projection (AVP) is presented, which represents a 2-D binary leaf as a 1-D shape profile for the mobile platform. AVP is an RST-invariant transform that describes a leaf efficiently, accurately, and compactly for low-computing devices. DCT compacts the AVP shape profile, which also helps in identifying partially tampered leaves, and it reduces the computational cost and energy consumption of a mobile device during classification. The method was tested on five challenging datasets, one of which is a diseased leaf dataset that includes incomplete leaf information. The AVP representation achieves the best results compared with other state-of-the-art methods, even in a small feature space. It brings mobility to botanical knowledge and helps the community understand nature more closely and easily. The proposed plant leaf biometric system intelligently decides whether to offload processing or to compute on the mobile. It works even in low visibility, and in future the method can be further optimized to reduce the battery consumption of a mobile device while providing extra augmented information, making it more practical for farmers.