1 Introduction

Classifying a fruit is difficult and important task in supermarket and to govern its price cashier must know the category of a particular fruit. This problem is resolved by using barcodes for packaged products. However, some of the fruits cannot be packaged using barcodes and therefore each fruit can be assign by codes but this leads to error in pricing as well as tedious memorization of barcodes. One solution can be assigned by images and codes to the cashier, but flipping the booklet can be time consuming. Hence the classification of fruits using image processing and machine vision system is rapidly growing research area because it provides consequential amount of information about its attributes and the nature to maintain quality standard, reduces cost and deliver effective information at real time. Broadly, the fruit quality rely on exterior parameters: intensity, color, size, surface, shape appearance and interior parameters like acid and sugar contents but size and color is the most essential factor for sorting of fruits. From the past few years, distinct technique have been enhanced to grade and inspect the quality of fruits. Among all type of fruits, apples are one of the common fruits because the farming rate of apples with respect to all other fruits is more in India. The image processing based inspection of apple fruit is crucial for rising the agility of grading and excluding the human faults in the mechanism. In recent years in different fields [25, 35, 36, 40, 55] several effective solutions based on machine learning and computer vision have been proposed by researchers.

For this purpose Leemans et al. [32] introduced a method based on color information to detect defect using gaussian model on ‘golden delicious’ apples. Leemans et al. [33] introduced a method based on Bayesian classification process for defect segmentation on ‘Jonagold’ apples. Rennick et al. [45] compares five classifiers includes k-means, fuzzy c-means, k-nearest neighbor, multilayer perceptron and probabilistic neural network classifier on ‘Granny Smith’ apples. Yang [60] introduces a machine vision system for detection of defects on apples like patches. A flooding algorithm is used to segment dark patches then neural network approach is used. Leemans et al. [34] propose apple grading using quadratic discriminant analysis and neural network with multilayer perceptron. Kavdir and Guyer [26] use fuzzy logic for decision making support to grade apples by measuring the features using high tech sensors and machine vision. Blasco et al. [12] propose an apple grading using Bayesian discriminant analysis for segmentation while classifying the apples in batches and repeatability in blemish detection results in good performance. Kleynen et al. [29] proposed a method to select the best filters using quadratic discriminant analysis for detecting defects in ‘Jonagold’ apple fruit Unay and Gosselin [50] introduced an Artificial Neural Network (ANN) based automatic defect segmentation system for ‘Jonagold’ apples to eliminate false segmentation related to illumination artifacts.. Kleynen et al. [30] proposed a method for defect detection of ‘Jonagold’ apple based on Baye’s theorem using multi-spectral vision system. Leemans and Destain [31] proposed a method for defect detection of ‘Jonagold’ apple based on shape and texture feature extraction and linear discriminant classifier.

Unay and Gosselin [49] introduced ANN segmentation technique to grade apple fruit. Bennedsen et al. [7] reports a novel approach to locate the defects and eliminate other dark area based on rotating the ‘Pink Lady’ and ‘Ginger Gold’ apples. Xing et al. [59] report for detecting bruises on ‘Golden Delicious’ apples using principle component method, the wavebands were selected then moment thresholding was developed for classification. Xiaobo and Jiewen [57] uses near infrared machine vision and electronic nose system for detection of ‘Fuji’ apples. These three sensors are combined and uses ANN, multiple linear regression for grading of apples. Anderson et al. [2] introduce spatial frequency domain imaging for assessing the apple. They calculate scattering and absorption properties for bruise and non-bruise region. Bennedsen et al. [8] use infrared images to detect surface defects using multiple ANN and principle components at different wavelength and applying Weiner filter. Zhu et al. Zhu and Jiang [65] introduce apple quality inspection using gabor wavelet decomposition of NIR images with kernel Principle Component Analysis PCA for classification. Xiaobo et al. [58] propose a method based on genetic algorithm to grade ‘Fuji’ apples which shows more accurate result then capital SVM and BP-ANN. Vijayrekha Vijayarekha [53] introduced “multivariate image analysis” for defect detection of apples depending on decomposition from PCA. Wang et al. [54] proposed an apple grading using “vector median filter” with region growing segmentation and SVM.

Xiaobo et al. Xiabo et al. [56] developed an in-line detection of apple defect which is segmented using multi threshold. The defects include region of interest to grade the fruit. Radojevic et al. Radojević and Petrović [42] proposed a digital parameterization to measure size, shape and surface spottiness of apple fruit. Unay et al. [51] presents a novel method using machine vision for grading of apple fruit. The multi-spectral images are first segmented by minimal confusion matrix then geometric, statistical and textural features are extracted to classify the fruit. Moradi et al. [38] developed apple defect detection by using statistical histogram based Fuzzy c-means algorithm. Suresha et al. [48] proposed apple sorting using SVM classifier in which first RGB images to HSI then average red and green component are calculated for classification. Gopal et al. [21] presented the classifier based on probability density function on 187 apple fruits. Dubey and Jalal [16] detects and classify apples based on color and texture features using complete local binary patterns. Baranowski et al. [6] presented a method for early bruises detection in apples using thermal and hyperspectral imaging cameras by minimal noise fraction analysis of the apple fruit. Ghabousian and Shamsi [20] proposed a contour algorithm to segment the images. Arlimatti [4] proposed a window based method for grading the apples. Zhou et al. [64] developed a method of image fusion for apple surface quality detection. In this, images are collected by multi sensors and then fused by certain arithmetic system. Each image has improved operating factor. Khoje et al. [28] present a novel method for fruit grading based on discrete Curvelet Transform for feature extraction. Vani and Vinod [52] developed a non-destructive method to classify apple fruit using probabilistic neural network approach. Dubey and Jalal [17] used global Color Histogram, Color Coherence Vector, Local Binary Pattern (LBP) and CLBP, Zernike moment to grade the apples.

Khade et al. [27] proposed a defect segmentation based on texture feature using k-means clustering algorithm. Raihana and Sudha [44] presents a modified water shed segmentation method to detect the defected apple fruit using gray level covariance matrix based feature extraction. Ali and Thai [1] proposed a prototype of an automated fruit grading system to detect the defect of apple fruit. Moallem et al. [37] proposed a computer vision algorithm to grade ‘Golden delicious’ apple using multilayer perceptron neural network. Jawale and Deshmukh [23] proposed automatic evaluation of apple fruit disease using thermal camera and image processing bruises detection system. Bhargava and Bansal [9] reviewed various techniques for fruits and vegetables quality evaluation. Zhang et al. [62] presented a new deep learning-based method for removing haze from a single input image by estimating a transmission map via joint estimation of clear image detail. Yu et al. [61] presented a novel edge-based active contour model for medical image segmentation which guarantees the stability of the evolution curve and the accuracy of the numerical computation. Chen et al. [14] proposed a novel matting method based on full feature coverage sampling and accelerated traveling strategy to get good samples for robust sample-pair selection. Singh and Singh [47] presented a novel technique to grade apples by using different features like histogram of oriented gradients, Law’s Texture Energy, Gray level co-occurrence matrix and Tamura features. Bhargava and Bansal [10] proposed quality evaluation of mono and bi-colored apples with computer vision and multispectral imaging. Bhargava and Bansal [11] also presented an automatic grading and detection of multiple fruits by machine learning.

Among the literature surveyed for quality inspection of apple fruit using computer vision, researchers presented distinct methods that differ at quality categories taken, apple varieties tested, particular equipment used, imaging technique employed. As a result of this review, finding a commonly relevant basis to the bi-colored and mono-colored group, the used grading algorithm compare these works and conclude as followed: “Quality grading of apple fruit by machine vision is a burdensome task due to the variance of the problem. Thus, the research for a robust, generic and accurate grading system that works for all apple variations while respecting all norms of standards is still in progress” [51] .

The remaining paper is organized as: Section 2, describe computer vision algorithm that presented and build the apple fruit sorting system which includes explanation of image acquisition, segmentation, feature extraction and classification. We conferred the result of our experiments in Section 3, and the paper is concluded with research finding and feasible future work in Section 4.

2 Proposed APPLE fruit grading system

This section describes the apple fruit sorting system that we have built. In broad with increased expectation of high quality and safety in fruits, a computer vision based apple quality inspection system must achieve explicit segmentation and then perform correct grading of apples by extracting the statistical, textural, geometric features for classification by non-destructive method. Figure 1. shows the basic flow chart of the process.

Fig. 1
figure 1

Flow Chart of the proposed approach

2.1 Image acquisition

The first database (Set 1) is the ‘Golden Delicious’ apples consist of 100 images taken from EOS 550D digital camera acquire immense quality images with a size of 3456 × 2304 pixels and a resolution of 0.03 mm/pixel. An inspecting chamber (camera and lighting system) is created where samples are placed. The distance between camera and sample is 20 cm and are created by Blasco et al. [12]. The second dataset (Set 2) of 112 images of 40 apples (RGB color space) with different angle by Redmi Note 5 mobile phone with a size of 4000 X 3000 pixels. The distance between phone and apple is 1 m. The third dataset (Set 3) of 4359 images of fresh and rotten red delicious/ golden delicious/ granny smith apples are taken from kaggle.com created by Sriram Reddy Kalluri [24]. Dtataset used by our algorithm is shown in Table 1. Figure 2. shows some samples of database images used.

Table 1 Characteristic of dataset images
Fig. 2
figure 2

Sample of dataset images used as a Healthy Golden Delicious b Defected Golden Delicious c Healthy apple Created by author d Defected apple Created by author e Healthy Mix f Defected Mix

2.2 Segmentation

Image acquired by camera abide of various noises which degrade the appearance of image. Hence, it cannot assign convenient data for processing of image. So, enhancement of image is done by adjusting the image intensity value or color map. European commission marketing standard [3] for apples defines category for quality which requires defect information.

Therefore, it is necessary to specifically segment the defect which is ambiguous because size, type, texture and color vary for different defect. In this work, after preprocessing image segmentation is required for separating the area during the evaluation.

This division is done using different techniques: converting RGB image to grayscale image, “Otsu” strategy and k- means clustering. The k-means clustering detects the degraded part of apple and partition the intensity values into k number of regions by reducing the distance between corresponding regions and objects. The value of k is chosen by elbow method. Firstly, determine Within-Cluster-Sum of Squared Errors (WSS) for various values of k. Choose the k for which WSS becomes first start to diminish. Here, k = 3 is chosen as an elbow. Some of the segmented images are shown in Fig. 3 using Otsu’s and k-means clustering method.

Fig. 3
figure 3

Steps in Preprocessing and segmentation: a Resize image (256 X 256) b Contrast Enhanced image c Otsu Segmentation d k-means segmented image

2.3 Feature extraction

Segmentation results in definite unconnected objects with different shape and size. Each object handles separately or together for correct decision of fruit. Table 2 summarized the different combination of features extracted from the defective segmented area. Our experiment shows that 13 statistical & textural, 14 geometric features, Gabor wavelet and color coherence features are used for best performance in grading system.

Table 2 Features extracted combination

2.3.1 Statistical features

The probability of first order measure observing the random gray pixels values are called statistical features which includes mean, standard deviation, variance, smoothness, RMS, inverse difference moment, kurtosis and skewness. These features do not take contingent relations of gray values into account.

2.3.2 Textural features

The probability of second order measure as pixel pairs are called textural features which includes contrast, correlation, energy, homogeneity and entropy. Pattern recognition widely uses geometric moment’s textural features. Another prominent group of textural features includes gray-level co-occurrence matrices (GLCM) [39, 43] which shows number of occurrences for gray level pairs as a square matrix. The GLCM features, inverse difference moment (IDM) is related to smoothness while variance and contrast are assessment of local variations. A GLCM is defined as a matrix (M X N) gray level image I, parameterized by an offset [22], defined as:

$$ {C}_{\Delta x,\Delta y}\left(i,j\right)={\sum}_{p=1}^N.{\sum}_{q=1}^M\left\{\begin{array}{c}\ 1,\kern0.5em if\ I\left(p,q\right)=i\ and\ I\left(p+\Delta x,q+\Delta y\right)=j\\ {}0,\kern0.5em otherwise\end{array}\right. $$
(1)

where I(p, q), gray level of image I at pixel (p, q).

2.3.3 Geometrical features

Also, features can be extracted for recognition depending on geometry of object. Nonetheless, defects of apple cannot have peculiar characteristics. Hence, we inked the geometric features that includes solidicity, area, and maximum length of area, eccentricity and perimeter listed in Table 3.

Table 3 Geomtrical features based on shape

2.3.4 Gabor wavelet feature

Gabor wavelet is invented by Dennis Gabor using the complex function to minimize the product of its standard deviation in time and frequency domain. Mathematically,

$$ f(x)={e}^{-{\left(x- xo\right)}^2/{a}^2}{e}^{- ik0\left(x-x0\right)} $$
(2)

The Fourier transform of Gabor Wavelet is also a Gabor Wavelet given by:

$$ F(k)={e}^{-{\left(k- ko\right)}^2/{a}^2}{e}^{- ix0\left(k-k0\right)} $$
(3)

2.3.5 Discrete cosine transform (DCT)

DCT is a powerful transform to extract proper features. After applying DCT to entire image, some of the coefficients are selected to construct feature vectors. Most of the conventional approaches select coefficients in a zigzag manner or by zonal masking. The low frequency coefficients are discarded in order to compensate illumination variations.

2.4 Classification

The essential feature for fruit grading is classification. By using sufficient number of training samples, the grading is done. In this experiment, we use statistical classifier known as SVM. It is a type of learning system which uses hypothesis space in higher dimensional space of linear function which implements statistical learning theory. It consist of two parts: Linearly separable, Non-linear separable.

2.4.1 Linearly separable

SVM is used for grading purpose which is formerly proposed for 2-class problems. SVM is a supervised learning method which is based on minimization procedure of structural risk [13]. It calculates the hyper plane which separates the classes with maximum margin for binary values. For preventing the biased to sample order before being introduced to classifier, samples are randomly ordered. Assuming two training of hollow and solid dots, H is hyper plane optimized, H1 & H2 (support vectors) are samples whose distance is minimum.

To find a linear function, linear regression defined by

$$ f(x)=\mathit{\operatorname{sgn}}\left({w}^{\ast }.x+{b}^{\ast}\right)=\mathit{\operatorname{sgn}}\left({\sum}_{i=1}^N{a}_i^{\ast }{y}_i\left({x}_i.x\right)+{b}^{\ast}\right) $$
(4)

where ai is Lagrange multipler, b is threshold, yi indicating the class to which point belongs.

SVM uses feature mapping where new space is separable which is done by kernel functions defined by:

$$ \mathrm{Polynomial}\ \mathrm{kernel}\ \mathrm{function}\ K\left(x,{x}_i\right)={\left(\left(x.{x}_i\right)+1\right)}^q $$
(5)
$$ \mathrm{RBF}\ \mathrm{kernel}\ \mathrm{function}\ K\left(x,{x}_i\right)=\exp \left(-\frac{\left|{\left(x-{x}_i\right)}^2\right|}{\sigma^2}\right) $$
(6)
$$ \mathrm{Linear}\ \mathrm{kernel}\ \mathrm{function}\ K\left(x,{x}_i\right)=1+x{x}_i+x{x}_i\min \left(x,{x}_i\right)-\frac{x+{x}_i}{2}\min {\left(x{x}_i\right)}^2+\frac{1}{3}\ \mathit{\min}{\left(x{x}_i\right)}^3 $$
(7)

2.4.2 Non-linearly separable

SVM uses non-seperable by introducing positive slack variables ξi, i = 1,2,…,l, in the constraints [15], which then become

$$ {x}_i.w+b\ge +1-{\upxi}_i, for\ {y}_i=+1 $$
(8)
$$ {x}_i.w+b\le -1+{\upxi}_i, for\ {y}_i=-1 $$
(9)
$$ {\upxi}_i\ge 0,\forall i $$
(10)

2.5 Evaluation

After extracting the different combination of features, the whole database is divided into two parts: Training set and testing set. The training set is used to train the system for classification, concurrently testing set is used to test the completion of classifier. The generation capability of the classifier are enhanced by using cross validation method on training dataset. This method generally employed to estimate the statistical applicability of the classifiers. It consist of four [41] types: Monte Carlo cross validation, Random subsampling, leave-one-out validation and k-fold cross validation. The k-fold cross validation is used because of its simplicity. In this method, K complementary subsets are partitioned from the dataset from which training is done for k-1 subset and validation is done by one subset. The whole process repeats K times by using every subset once for validation. The mean value of the results are computed for the final estimation. It can be observed that by using k-cross validation process all observations are used for both training and testing and each observation is used for validation exactly once. Figure 4 shows the simplified diagram of 5 –fold cross validation process.

Fig 4
figure 4

A five-fold cross validation

The predication performances of the classifiers uses following measures: accuracy, sensitivity and specificity. Accuracy is calculated as all correct overall (true positive & negative) classifications. Sensitivity is also known as recall or true positive rate and is the probability of detection of undesirable objects which is correctly classified. Specificity is also known as true negative rate and is the probability of detecting the sound kernel correctly.

$$ Accur\mathrm{a} cy\ \left(\%\right)=\frac{TP+ TN}{TP+ TN+ FP+ FN}\ X\ 100\% $$
(11)
$$ Sensitivity\ \left(\%\right)=\frac{TP}{TP+ FN}\ X\ 100\% $$
(12)
$$ Specificity\ \left(\%\right)=\frac{TN}{TN+ FP}\ X\ 100\% $$
(13)

where TP, TN, FP and FN represent true positive, true negative, false positive and false negative of confusion matrix respectively.

3 Results and discussions

European commission marketing standard [3] for apples defines one reject and three acceptable quality category. Nonetheless, wide literature consist of healthy/defective grading due to adversity of collection of database and grading processes. In the pursuit of detection of best feature of apple two category sorting, we have inspected some statistical, textural and geometrical features while Support Vector Machine (SVM) is used for classification. The three dataset of apples are 100 Golden delicious (74 healthy, 26 defected) for Set 1 and 112 Normal (56 healthy, 56 defected) for Set 2 and 4359 (2017 healthy, 2342 defected) for Set 3 has trained with different combination (12/18/24/30) of features. Note that, each feature set selected consist of combination of statistical, textural and geometrical features. We implement fruit sorting with SVM using different kernel functions (linear, quadratic, polynomial, RBF, MLP) at a time as shown in Fig. 5.

Fig. 5
figure 5

Analysis of classification of apple fruit

It can be seen that by using different kernel function best performance is obtained when linear SVM is used among other kernel functions. Highest recognition rate for minimum combination (12 feature) with linear SVM for k = 10 are 72.00% (Set 1), 73.13% (Set 2) and 75.38% (Set 3). To improve the recognition rate more features are selected (18 feature) with linear SVM for k = 10 are 82.00% (Set 1), 79.16% (Set 2) and 79.38% (Set 3). To make system robust more features are combined (24 features) with linear SVM for k = 10 are 84.00% (Set 1), 87.91% (Set 2) and 86.23% (Set 3). Statistically, as examine increasing more features (30 feature) highest recognition rate achieved with linear SVM for k = 10 are 93.00% (Set 1), 96.81% (Set 2) and 95.41% (Set 3)as shown in Table 4. It can be observed that by selection of combination of features and increasing the number of features the accuracy increases. Also, using different kernel function best performance is obtained when combination of 30 features are incorporated. While increasing number of combination of features above 30 the accuracy rate is observed to be decreasing. Therefore the proposed combination of features are advantageous for two-category apple grading. It can be found that linear SVM is superior to all other kernel function and is best for apple recognition. Figure 6 displayed the results within each plot, SVM with different kernel function is taken at y-axis and accuracy rate obtained using the test images is represent at the x-axis. In each plot, it can be observed that the performance combination of all the features is far better than the performance of small set of features.

Table 4 SVM based fruit quality grading results using different kernel functions (in percentage)
Fig. 6
figure 6

Accuracy rate for a Set 1 with k = 5, b Set 1 with k = 10, c Set 2 with k = 5, d Set 2 with k = 10, e Set 3 with k = 5, f Set 3 with k = 10 with linear, quadratic, polynomial, RBF, MLP kernel functions

Sensitivity and specificity give an indication of how well the sound kernels were classified. As classification accuracy only indicates the presence of errors, one may prefer to describe the model in terms of sensitivity and specificity to better describe the models. The lower sensitivity and specificity indicates the classification errors and the higher values indicates the perfectly classified classes. The False Positive Rate (FPR) and False Negative Rate (FNR) were used to measure the errors done by method. The minimum and maximum value of FPR are 3.34% and 58.34% and of FNR are 12.51% and 65.00% respectively.

3.1 Comparison to other methods

Among different works popularized in the literature, few of them attains our attention because of apple fruit defect detection system as summarized in Table 5. Seng and Mirisaee [46] uses 55 apples fruit by extracting geometric features and classifying by KNN achieved 90.00% recognition rate. Ashok and Vinod [5] uses 65 apples images by extracting geometric features and classifying by PNN achieved 86.52% recognition rate. Moallem et al. [37] uses golden delicious apple database [12] consist of 120 (80 Healthy/ 40 Defected) apple color images which results in accuracy of 89.20% with 16 features. The proposed system also uses same database [13] but it consist of 100 (74 Healthy/ 26 Defected) which results in accuracy of 93.41% with 30 features. Both the database are downloaded from the same link but the researcher found 100 images instead of 120 (as mentioned by [37]). The obtained accuracy of the proposed system is compatible with Khan et al. (2019). However, there is an improvement of 3.91% from 92.90% to 96.81% is encouraging and satisfactory. Comparative analysis of the proposed system with other existing techniques shows improved and better accuracy with two different datasets. Hence, our approach contributes to improved recognition with cascaded features and a suitable classifier.

Table 5 Comparative Analysis for Sorting of Apple Fruit

The accuracy obtained by the proposed system is better as compared to the available system. The system performance will be improved by taking more combination of features.

3.2 ROC curve

The receiver operating characteristic curve (ROC) analysis is a tool to classify the implication of the results contributed by classification and identification systems. The true positive, true negative, false positive and false negative and the various tuning parameters results in the outcome range for ROC. It is the plot of relation between true positive rate (TPR) and false positive rate (FPR). Figure 7 shows the ROC curve for our grading algorithm for k = 10 using linear function Set 1, Set 2 and Set 3.

Fig. 7
figure 7

Receiver operating characteristic curve for a Set 1with k = 5, b Set 1 with k = 10, c Set 2 with k = 5, d Set 2 with k = 10 e Set 3 with k = 5 (f) Set 3 with k = 10

4 Conclusion and future work

In this research, a computer vision based apple fruit grading and sorting is introduced. Firstly the area of fruit is extracted from background and is segmented by k-means clustering algorithm. After segmentation, defected region are extracted from multiple features which is fed to SVM for fruit grading by binary (healthy or defected) classification. Our apple fruit sorting results shows that different combination of different features acquired more accuracy compared to state of art. The maximum accuracy of 96.81% with 30 features and linear SVM classifier is encouraging and much better in comparison to another state of art technologies. Results showed that the proposed system with linear SVM classifier has better performance as compared to other kernels of SVM classifier.

Finally, the experimental results validations support the observations as follows: if features are less, one should prefer the combination of different features, improved recognition rate can be achieved with different features. However, the future work may further be improved by considering large number of apple images, different segmentation techniques, more significant features and combination of classifiers techniques. In the future, a more generalized and robust system with improved performance may be worked upon. The solution proposed in this work must be tested for multiple apple images to access its generality, robustness and accuracy and must be installed in a real life apple sorting environment. Also, the idea of the proposed system may be extended using convolution neural network [18, 19, 63].