
1 Introduction

Conserving the earth’s biodiversity for future generations is a fundamental global task. Twenty percent of all the world’s plants are already at the edge of extinction [1], and many methods must be combined to address this problem. Preserving floral biodiversity involves mapping plant distribution by collecting pollen and later identifying and classifying it in a laboratory environment. Pollen classification is a qualitative process, involving the observation and discrimination of features [2]. The manual method depends on experts and takes a large amount of time; pollen grain classification using computer vision is therefore highly needed in palynology, the study of the external morphological features of mature pollen grains [3]. Several characteristic features of plants, such as leaf, flower and seed, are used to determine the rank of a taxon (taxonomic unit), and palynological evidence has proven useful in verifying relationships in established taxonomic groups. Pollen grains are distinguished primarily by their structure and surface sculpture (texture) [4]. There are approximately 300,000 species of flowering plants, classified under 410 families as per the Takhtajan system of classification [5]. As taxonomists have generally observed, every family contains plants whose external characteristics look similar but whose identities are under dispute (their species is doubtful).

The main objective of pollen grain classification is to resolve the species and family of plants that are under dispute in the field of plant taxonomy. Classification of pollen grains also finds applications in identifying airborne pollens that cause allergy (aerobiology), solving legal problems (forensic palynology), studying pollen in fossils (Quaternary palaeopalynology) and studying the botanical and geographical origin of honey (melissopalynology).

Classification of pollen grains using image processing techniques aims at obtaining the highest-quality output, and many attempts have been made to automate the identification, classification and recognition of pollen grains by means of image processing.

In [6], non-linear features are extracted from pollen grain images using wavelet transforms, and the extracted features are used to classify the pollen grain images with a self-organizing map (SOM) neural network. Attempts have also been made at pollen texture identification using a multi-layer perceptron (MLP) neural network in preference to statistical classifier methods [7]. In [8], a prototype system is presented for classifying the two genders of pollen grains of three plant types of the family Urticaceae; the classification is based on shape analysis using area, perimeter and compactness as features.

Work has also been carried out on the recognition of pollen grain images in which five types of pollen grains are classified based on surface texture and geometric shape; surface texture is extracted using the Gabor transform, geometric shapes using moment invariants, and an artificial neural network serves as the classifier [9]. The feasibility of applying computer vision to determine, in a fast and precise way, the floral origin of pollen from honey of northwest Spain has also been investigated [10]. The authors classified pollen grain images using a support vector machine (SVM) and a multi-layer perceptron (MLP) alongside a minimum distance classifier; specifically, several well-known classifiers, namely k-nearest neighbor (KNN), SVM and MLP, are used to increase the classification rate. The method was applied to identifying honeybee pollen, and the work focuses mainly on improving the classification stage. The combination of an SVM classifier and a local linear transformations (LLT) texture vector achieved the best performance in discriminating among the five most abundant plant species from three geographical places in northwest Spain.

Almost all the works reported in the literature for classification of pollen grains are limited to very few families or are dependent on a specific family or area. No work has been carried out on classification of pollen grains independent of family. In this work we design a model for pollen grain classification that is independent of families.

The rest of the paper is organized as follows. In Sect. 2, we present a texture based model for classification of pollen grain images. Details of the experimentation are discussed in Sect. 3. The paper is concluded, along with the scope for future work, in Sect. 4.

2 Proposed Model

The proposed model has two stages: feature extraction and classification. In the training phase, texture features (wavelet/Gabor/LBP/GLDM/GLCM) are extracted from a given set of pollen grain images and used to train the system. In the classification stage, texture features are extracted from a given unknown test pollen grain image and queried against a nearest neighbor (NN) classifier to label the unknown pollen grain. The block diagram of the proposed model is given in Fig. 1.

Fig. 1
figure 1

Block diagram of proposed model

2.1 Feature Extraction

The surface texture of the pollen grain plays a vital role in pollen classification. The outer surface of the pollen grain is covered with sculpture elements and has differently structured apertures, thin regions through which one pollen grain can be differentiated from another. Hence, in this work we recommend using texture features for the classification of pollen grains. Different texture features, namely wavelet, Gabor wavelet, local binary pattern (LBP), gray level co-occurrence matrix (GLCM) and gray level difference method (GLDM) features, and their combinations are studied here. The following subsections provide an overview of these texture features.

2.1.1 Wavelet Transformation

Wavelet transforms are an alternative to the short-time Fourier transform that overcomes problems related to its frequency and time resolution properties. The basic idea of the discrete wavelet transform (DWT) is to provide a time–frequency representation. In two dimensions, the DWT requires a two-dimensional scaling function \( \varphi(x, y) \) and three two-dimensional wavelets \( \psi^{H}(x, y) \), \( \psi^{V}(x, y) \) and \( \psi^{D}(x, y) \), each the product of two one-dimensional functions. Excluding products that produce one-dimensional results, the four remaining products give the separable scaling function \( \varphi(x, y) = \varphi(x)\,\varphi(y) \) and the separable, directionally sensitive wavelets \( \psi^{H}(x, y) = \psi(x)\,\varphi(y) \), \( \psi^{V}(x, y) = \varphi(x)\,\psi(y) \) and \( \psi^{D}(x, y) = \psi(x)\,\psi(y) \). These wavelets measure intensity variations of an image along different directions: \( \psi^{H} \) measures variation along columns (horizontal edges), \( \psi^{V} \) responds to variation along rows (vertical edges) and \( \psi^{D} \) corresponds to variation along diagonals.
The scaled and translated two-dimensional wavelet functions are \( \varphi_{j,m,n}(x, y) = 2^{j/2}\,\varphi\left(2^{j}x - m,\, 2^{j}y - n\right) \) and \( \psi^{i}_{j,m,n}(x, y) = 2^{j/2}\,\psi^{i}\left(2^{j}x - m,\, 2^{j}y - n\right) \), \( i = \{H, V, D\} \), where the index \( i \) identifies the directional wavelets \( \psi^{H}(x, y) \), \( \psi^{V}(x, y) \) and \( \psi^{D}(x, y) \). The discrete wavelet transform of an image \( f(x, y) \) of size \( M \times N \) is,

$$ W_{\varphi}(j_{0}, m, n) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, \varphi_{j_{0},m,n}(x, y) $$
(1)
$$ W_{\psi}^{i}(j, m, n) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, \psi^{i}_{j,m,n}(x, y), \quad i = \{H, V, D\} $$
(2)

\( j_{0} \) is an arbitrary starting scale, and the coefficients \( W_{\varphi}(j_{0}, m, n) \) define an approximation of \( f(x, y) \) at scale \( j_{0} \). The \( W_{\psi}^{i}(j, m, n) \) coefficients add horizontal, vertical and diagonal details for scales \( j \ge j_{0} \). Normally \( j_{0} = 0 \) and \( N = M = 2^{J} \), so that \( j = 0, 1, 2, \ldots, J - 1 \) and \( m = n = 0, 1, 2, \ldots, 2^{j} - 1 \) [11].
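One level of this decomposition can be sketched with the Haar basis, whose scaling and wavelet filters reduce to pairwise sums and differences. The following numpy-only sketch (the paper itself uses a Daubechies wavelet; Haar is chosen here only to keep the example short) computes the approximation and the three detail sub-bands of Eqs. (1)–(2):

```python
import numpy as np

def haar_dwt2(f):
    """One level of the 2D DWT in the Haar basis.

    Returns the approximation W_phi and the three detail sub-bands
    corresponding to psi^H, psi^V and psi^D. Assumes even dimensions.
    """
    # Separable analysis: low/high-pass along rows, then along columns.
    lo = (f[:, 0::2] + f[:, 1::2]) / np.sqrt(2)    # row low-pass
    hi = (f[:, 0::2] - f[:, 1::2]) / np.sqrt(2)    # row high-pass
    LL = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)  # approximation W_phi
    LH = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)  # horizontal details
    HL = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)  # vertical details
    HH = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)  # diagonal details
    return LL, LH, HL, HH

# Toy 4x4 "image"; repeated application of haar_dwt2 to LL gives the
# multi-level decomposition used for feature extraction.
img = np.arange(16, dtype=float).reshape(4, 4)
LL, LH, HL, HH = haar_dwt2(img)
```

Because the basis is orthonormal, the total energy of the four sub-bands equals that of the input image, which is a quick sanity check on any DWT implementation.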

2.1.2 Gabor Wavelet Transform

Similar to the short-time Fourier transform, the Gabor wavelet transform has been utilized as an effective and powerful time–frequency analysis tool for identifying rapidly varying characteristics of signals. The use of Gabor filters in extracting texture features is motivated by several factors: these filters can be considered orientation- and scale-tunable edge and line detectors, and the statistics of their responses in a given region can be used to characterize the texture information [12]. A two-dimensional Gabor function \( g(x, y) \) and its Fourier transform \( G(u, v) \) are,

$$ g(x,\,y) = \left[ {\frac{1}{{2\pi \sigma_{x} \sigma_{y} }}} \right]\,\exp \,\left[ { - \frac{1}{2}\left[ {\frac{{x^{2} }}{{\sigma_{x}^{2} }} + \frac{{y^{2} }}{{\sigma_{y}^{2} }}} \right] + 2\pi jWx} \right] $$
(3)
$$ G(u, v) = \exp\left\{ -\frac{1}{2}\left[ \frac{(u - W)^{2}}{\sigma_{u}^{2}} + \frac{v^{2}}{\sigma_{v}^{2}} \right] \right\} $$
(4)

where \( \sigma_{u} = \frac{1}{2\pi\sigma_{x}} \) and \( \sigma_{v} = \frac{1}{2\pi\sigma_{y}} \). Gabor functions form a non-orthogonal but complete set. Let \( g(x, y) \) be the mother wavelet; a filter bank is obtained through dilations and rotations of \( g(x, y) \) via the generating function \( g_{mn}(x, y) = a^{-m} g(x', y') \), \( a > 1 \), with integers \( m, n \), where \( x' = a^{-m}(x\cos\theta + y\sin\theta) \) and \( y' = a^{-m}(-x\sin\theta + y\cos\theta) \), \( \theta = \frac{n\pi}{N} \), \( N \) is the total number of orientations and \( a^{-m} \) is the scale factor. The Gabor wavelet transform of an image \( f(x, y) \) is given as,

$$ \mathop W\nolimits_{mn} (x,\,y) = \int {f(x_{1} ,\,y_{1} )} g_{mn} *(x - x_{1} ,\,y - y_{1} )dx_{1} dy_{1} \, $$
(5)

where * denotes the complex conjugate. Assuming that the local texture regions are spatially homogeneous, the mean \( \mu_{mn} \) and standard deviation \( \sigma_{mn} \) of the magnitude of the transform coefficients are used to represent the region for classification: \( \mu_{mn} = \iint |W_{mn}(x, y)|\,dx\,dy \) and \( \sigma_{mn} = \sqrt{\iint \left( |W_{mn}(x, y)| - \mu_{mn} \right)^{2} dx\,dy} \). In our work, we have used four angular orientations and six scale factors together with the wavelet features.
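The filter-bank feature extraction above can be sketched as follows. The kernel size, \( \sigma \) values and modulation frequency `W` below are illustrative assumptions, not the parameters used in the paper; the point is the pipeline of Eqs. (3) and (5): build a rotated complex Gabor kernel per (scale, orientation) pair, convolve, and record the mean and standard deviation of the response magnitude:

```python
import numpy as np

def gabor_kernel(size, sigma_x, sigma_y, W, theta):
    """Complex Gabor kernel of Eq. (3), rotated by theta (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (1.0 / (2 * np.pi * sigma_x * sigma_y)) * np.exp(
        -0.5 * (xr**2 / sigma_x**2 + yr**2 / sigma_y**2)
        + 2j * np.pi * W * xr)

def gabor_features(img, thetas, scales, size=15, W=0.25):
    """mu_mn and sigma_mn of |W_mn| for each (scale, orientation) pair."""
    feats = []
    for m in scales:
        for theta in thetas:
            k = gabor_kernel(size, sigma_x=2.0 * m, sigma_y=2.0 * m,
                             W=W / m, theta=theta)
            # FFT-based 'same' convolution with the complex kernel
            shape = (img.shape[0] + size - 1, img.shape[1] + size - 1)
            full = np.fft.ifft2(np.fft.fft2(img, s=shape)
                                * np.fft.fft2(k, s=shape))
            resp = np.abs(full[size // 2:size // 2 + img.shape[0],
                               size // 2:size // 2 + img.shape[1]])
            feats += [resp.mean(), resp.std()]
    return np.array(feats)

rng = np.random.default_rng(0)
img = rng.random((32, 32))
fv = gabor_features(img, thetas=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    scales=[1, 2])   # 2 scales x 4 orientations x 2 stats
```

With the paper's four orientations and six scales, the same loop would yield \( 4 \times 6 \times 2 \) mean/deviation values per image.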

2.1.3 Local Binary Pattern

The LBP is a gray-scale and rotation invariant texture operator which characterizes the spatial structure of local image texture [13]. To achieve gray-scale invariance, a unique pattern label is assigned to every pixel of an image based on comparing its value with those of its neighborhood. The pattern label is computed by

$$ LBP_{P,R} = \sum_{p=0}^{P-1} s(g_{p} - g_{c})\, 2^{p} $$
(6)
$$ \text{where}\quad s(g_{p} - g_{c}) = \begin{cases} 1, & g_{p} - g_{c} \ge 0 \\ 0, & g_{p} - g_{c} < 0 \end{cases} $$

\( g_{c} \) is the gray value of the central pixel, which has a circularly symmetric neighborhood \( g_{p}\ (p = 0, 1, \ldots, P - 1) \); \( g_{p} \) is the gray value of a neighbor, \( P \) is the number of neighbors and \( R \) is the neighborhood radius. The \( LBP_{P,R} \) operator produces \( 2^{P} \) different output values, corresponding to the \( 2^{P} \) different binary patterns that can be formed by the \( P \) pixels in the neighborhood. When the image is rotated, the gray values \( g_{p} \) correspondingly move along the perimeter of the circle. Since \( g_{0} \) is always assigned the gray value of the element at \( (0, R) \), to the right of the center, rotating a specific binary pattern naturally results in a different \( LBP_{P,R} \) value. Rotation invariance is therefore achieved by assigning a unique label to each rotation invariant binary pattern, that is,

$$ \mathop {LBP}\nolimits_{P,\,R}^{ri} = \hbox{min} \left\{ {\left. {ROR\left( {\mathop {LBP}\nolimits_{P,\,R} ,\,i} \right)} \right|\quad i = 0,\,1,\, \ldots ,\,P - 1} \right\} $$
(7)

where \( ROR\left( LBP_{P,R}, i \right) \) performs a circular bitwise right shift on the \( P \)-bit number \( LBP_{P,R} \), \( i \) times. The uniformity (U) value of an \( LBP_{P,R}^{ri} \) pattern is defined as the number of spatial transitions (bitwise 0/1 changes) in the pattern and is given by,

$$ U\left( LBP_{P,R}^{ri} \right) = \left| s(g_{P-1} - g_{c}) - s(g_{0} - g_{c}) \right| + \sum_{p=1}^{P-1} \left| s(g_{p} - g_{c}) - s(g_{p-1} - g_{c}) \right| $$
(8)

As per the recommendation in [13], a pattern with uniformity measure \( U \le 2 \) is referred to as a uniform pattern and is assigned a label in the range 0 to \( P \) corresponding to its number of spatial transitions; all other patterns, with \( U > 2 \), are assigned the label \( P + 1 \). Then we have

$$\mathop {LBP}\nolimits_{{P,R}}^{{riu2}} = \left\{{\begin{array}{ll} \sum\limits_{{p = 0}}^{{P - 1}} {S(\mathop g\nolimits_{p} } - \mathop g\nolimits_{c} ),&{\text{ if}}\, U(\mathop {LBP}\nolimits_{{P,R}}^{{ri}} ) \le 2 \\ P + 1, & {\text{otherwise}} \end{array} } \right.. $$
(9)

The texture features of a pollen grain image are extracted using the above LBP operator. Computing the LBP with radius \( R \) and \( P \) pixels over the entire image of size \( M \times N \) results in a labeled image, which is represented by the histogram,

$$ H(k) = \sum_{i=1}^{M} \sum_{j=1}^{N} f\left( LBP_{P,R}^{riu2}(i, j), k \right), \quad k \in [0, K] $$
(10)
$$ f\left( {\mathop {LBP}\nolimits_{P,\,R}^{riu2} (i,\,j),\,k} \right) = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {\mathop {LBP}\nolimits_{P,R}^{riu2} (i,\,j) = k} \hfill \\ {0,} \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right. $$
(11)

where \( K \) is the maximal LBP pattern label.

2.1.4 Gray Level Co-occurrence Matrix

Texture feature calculations use the contents of the GLCM to give a measure of the variation in intensity at a pixel of interest, as proposed in [14], and they characterize texture using a variety of quantities derived from second-order image statistics. Co-occurrence texture features are extracted from an image in two steps. First, the pairwise spatial co-occurrences of pixels separated by a particular angle and distance are tabulated in a gray level co-occurrence matrix (GLCM). Second, the GLCM is used to compute a set of scalar quantities that characterize different aspects of the underlying texture. The GLCM tabulates how often different combinations of gray levels co-occur in an image. It is an \( N \times N \) square matrix, where \( N \) is the number of gray levels in the image. An element \( p(i, j, d, \theta) \) of the GLCM represents the relative frequency with which a pixel \( p \) at location \( (x, y) \) with gray level \( i \) co-occurs with a pixel of gray level \( j \) located at distance \( d \) from \( p \) in orientation \( \theta \). While GLCMs provide a quantitative description of a spatial pattern, they are too unwieldy for practical image analysis, so [14] proposed a set of scalar quantities for summarizing the information contained in a GLCM: a total of 14 quantities, or features, were originally proposed, of which typically only subsets are used [15]. In our work we considered a subset of five GLCM features, as shown in Table 1.

Table 1 Five GLCM features
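The two-step procedure can be sketched as follows. The paper does not list its five features here, so the particular quantities below (contrast, correlation, energy, homogeneity, entropy) are an illustrative and commonly used choice from Haralick's set, not necessarily the ones in Table 1:

```python
import numpy as np

def glcm(img, dx, dy, levels):
    """Normalised GLCM p(i, j) for displacement (dx, dy), one direction."""
    H, W = img.shape
    P = np.zeros((levels, levels))
    # tabulate co-occurring gray level pairs inside the overlap region
    for y in range(max(0, -dy), min(H, H - dy)):
        for x in range(max(0, -dx), min(W, W - dx)):
            P[img[y, x], img[y + dy, x + dx]] += 1
    return P / P.sum()

def glcm_features(P):
    """Five scalar quantities summarising the GLCM (illustrative subset)."""
    i, j = np.indices(P.shape)
    eps = 1e-12
    contrast = ((i - j) ** 2 * P).sum()
    energy = (P ** 2).sum()
    homogeneity = (P / (1 + np.abs(i - j))).sum()
    entropy = -(P * np.log(P + eps)).sum()
    mu_i, mu_j = (i * P).sum(), (j * P).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * P).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * P).sum())
    correlation = (((i - mu_i) * (j - mu_j) * P).sum() / (sd_i * sd_j + eps))
    return [contrast, correlation, energy, homogeneity, entropy]

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
feats = glcm_features(glcm(img, dx=1, dy=0, levels=4))
```

In practice one averages the features over several orientations \( \theta \) (e.g. 0°, 45°, 90°, 135°) to reduce directional bias.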

2.1.5 Gray Level Difference Method

The gray level difference method (GLDM) is based on the occurrence of two pixels which have a given absolute difference in gray level and which are separated by a specific displacement \( \delta \). For a given displacement vector \( \delta = (\Delta x, \Delta y) \), let \( S_{\delta}(x, y) = |S(x, y) - S(x + \Delta x, y + \Delta y)| \) and let \( D(i \mid \delta) = \text{Prob}[S_{\delta}(x, y) = i] \) be the estimated probability density function. Four forms of the vector \( \delta \) are considered, \( (0, d) \), \( (-d, d) \), \( (d, 0) \) and \( (-d, -d) \), where \( d \) is the inter-sample spacing [16]. In this work, the four probability density functions corresponding to the four displacement vectors are obtained and the texture features are calculated for each probability density function.
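Estimating \( D(i \mid \delta) \) amounts to histogramming the absolute differences over the region where both pixels fall inside the image. A minimal numpy sketch for the four displacement forms named above:

```python
import numpy as np

def gldm_pdf(img, delta, levels):
    """Estimated density D(i | delta) of the absolute gray level
    difference for displacement delta = (dx, dy)."""
    dx, dy = delta
    H, W = img.shape
    # overlap region where both pixels lie inside the image
    y0, y1 = max(0, -dy), min(H, H - dy)
    x0, x1 = max(0, -dx), min(W, W - dx)
    diff = np.abs(img[y0:y1, x0:x1].astype(int)
                  - img[y0 + dy:y1 + dy, x0 + dx:x1 + dx].astype(int))
    counts = np.bincount(diff.ravel(), minlength=levels)
    return counts / counts.sum()

# the four displacement forms (0,d), (-d,d), (d,0), (-d,-d) with d = 1
d = 1
deltas = [(0, d), (-d, d), (d, 0), (-d, -d)]
img = np.array([[0, 1, 2],
                [1, 2, 3],
                [2, 3, 0]])
pdfs = [gldm_pdf(img, dl, levels=4) for dl in deltas]
```

Scalar texture features (e.g. mean, contrast, entropy of each density) are then computed from each of the four estimated densities.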

Since our interest is to study which texture feature statistics are useful for pollen grain classification, we used all of the above feature extraction models to extract the surface texture of a pollen grain image. The extracted features are then classified using the NN classifier.

2.2 Classification

Classification determines to which of a finite number of physically defined classes (such as different classes of pollen grains) an unknown sample pollen grain image belongs. In this work we use the nearest neighbor (NN) classifier, a supervised learning method. To decide whether a sample \( S_{i} \) belongs to class \( C_{j} \), the similarity \( Sim(S_{i}, S_{j}) \) or dissimilarity \( Disim(S_{i}, S_{j}) \) of \( S_{i} \) to all other samples \( S_{j} \) in the training set is determined, and the \( n \) most similar training samples (neighbors) are selected. The proportion of neighbors having the same class may be taken as an estimator for the probability of that class, and the class with the largest proportion is assigned to the sample \( S_{i} \).
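The decision rule just described can be sketched in a few lines, using Euclidean distance as the dissimilarity measure (the paper does not specify its measure, so this choice is an assumption); the toy feature vectors and class labels below are hypothetical:

```python
import numpy as np

def knn_classify(train_X, train_y, sample, n=1):
    """Assign the majority class among the n nearest training samples."""
    # Euclidean dissimilarity of the sample to every training vector
    dists = np.linalg.norm(train_X - sample, axis=1)
    nearest = train_y[np.argsort(dists)[:n]]     # labels of n neighbours
    classes, votes = np.unique(nearest, return_counts=True)
    return classes[np.argmax(votes)]             # largest class proportion

# hypothetical 2-D feature vectors for two pollen classes (0 and 1)
train_X = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
train_y = np.array([0, 0, 1, 1])
label = knn_classify(train_X, train_y, np.array([0.85, 0.85]), n=3)
```

With `n=1` this reduces to the plain nearest neighbor rule used in the experiments; larger `n` turns the class proportion among neighbors into the probability estimate described above.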

3 Dataset and Experimentation

We have created our own dataset of 419 pollen grain images. Of these, around 50 images were collected from World Wide Web sources [17–19], 100 images were collected from experts and around 269 images were collected using standard procedures [4, 20]. The images in the dataset span 18 classes of pollen grains, irrespective of family, and the dataset contains both LM (light microscopic) and SEM (scanning electron microscopic) images. The 18 classes are defined according to the NPC (number, position and character of aperture) classification system [20]: (a) MC: Monocolpate, (b) DC: Dicolpate, (c) TC: Tricolpate, (d) TRC: Tetracolpate, (e) PC: Pentacolpate, (f) HC: Hexacolpate, (g) DCP: Dicolporate, (h) TCP: Tricolporate, (i) TRCP: Tetracolporate, (j) PCP: Pentacolporate, (k) MP: Monoporate, (l) DP: Diporate, (m) TP: Triporate, (n) TRP: Tetraporate, (o) PP: Pentaporate, (p) PPP: Polypantaporate, (q) NAP: Nonaperturate and (r) SAP: Spiraperturate. Sample examples of these classes are shown in Fig. 2.

Fig. 2
figure 2

18 classes of pollen grain based on NPC classification

The pollen images of the 18 classes are kept in a database and fed individually into the different feature extraction models. We used a 5th-level decomposition of the two-dimensional discrete Daubechies wavelet transform and extracted 15 features, so the feature vector comprises 15 elements: the first 14 elements consist of the average intensity value (AIV) of the matrices obtained from the high-pass filters, i.e. the horizontal, vertical and diagonal detail matrices obtained at each level, while the 15th element is the AIV of the approximation coefficient matrix. For the Gabor wavelet we used four angular rotations, 22.5°, 45°, 77.5° and 90°, with six scale factors, 0, 2, 4, 6, 8 and 10; with 15 wavelet features per rotation–scale combination, a total of 4 × 6 × 15 = 360 features are extracted. With LBP we extracted 256 features; with GLCM we used the five features shown in Table 1; and with GLDM we extracted features using the four probability density functions.

All the above experiments were conducted using all 419 pollen grain images of the 18 classes, with 50, 60 and 70 % of the data used for training. For classification we used the five features and their possible combinations. The obtained results are shown in Table 2, from which it is clear that the Gabor wavelet performs better than the other features, with the combinations W + G, G + GC and W + G + GC achieving the same classification accuracy of 91.66 %. Table 3 shows the total number of samples in each class and the numbers of training, testing, correctly classified and misclassified samples. The confusion matrix and F-measure graph for 70 % training and 30 % testing of each class are shown in Table 4 and Fig. 3 respectively.

Table 2 Classification accuracy of different combination of texture features under varying training sets
Table 3 Training, testing, correctly and wrongly classified samples
Table 4 Confusion matrix for pollen grain classification
Fig. 3
figure 3

F-measure for 18 classes of pollen grain

In our experiments, the Gabor wavelet features gave the best results, as the Gabor wavelet provides optimal resolution in both the time (spatial) and frequency domains.

4 Conclusion

The current work addresses the problem of pollen grain classification based on texture, using different models, namely wavelet, Gabor, LBP, GLCM and GLDM features with a nearest neighbor (NN) classifier. As per our survey, earlier works deal with specific families of pollen grains for specific applications, whereas this work deals with different classes of pollen grains irrespective of family. Classification of pollen grains using the Gabor wavelet with the NN classifier gave better results than the other models. For further improvement, other features such as shape and contour could be extracted along with the texture features using different feature extraction models, and classifiers other than nearest neighbor (NN) could be used.