Abstract
Chickpea is an important edible legume that can be grown under rainfed conditions. Image analysis and machine learning can be used for rapid and non-destructive determination of seed physical attributes, and such techniques yield objective, accurate and reliable results. In this study, size, shape, and area attributes of 26 different chickpea cultivars were determined by image processing, color properties were determined by the chromametric method, and machine learning algorithms (Multilayer Perceptron-MLP, Random Forest-RF, Support Vector Regression-SVR, and k-Nearest Neighbor-kNN) were used for mass prediction of chickpea seeds. Ilgaz and Çakır cultivars had the highest size and shape values, while İzmir and Sezenbey cultivars had the highest color attributes. Compactness (in horizontal orientation) had a positive correlation with the equivalent diameter (in vertical orientation) and elongation (in vertical orientation) (r = 0.99 for both parameters). Besides, a* had a high correlation with b* (r = 0.97). According to Euclidean distances, Akça–İnci and Damla–Işık cultivars were identified as the closest cultivars in terms of physical attributes. In PCA, PC1 and PC2 explained 73.17% of the total variation. PC1 included length, geometric mean diameter, volume and surface area, and PC2 included roundness (in horizontal orientation), thickness, elongation (in horizontal orientation) and sphericity. RF and MLP had the most successful results, with correlation coefficients of 0.8054 and 0.8043 for train-test split, and 0.8231 and 0.8142 for k-fold cross validation, respectively. Present findings revealed that texture image processing and machine learning could be used as an effective and inexpensive discrimination tool for chickpea seeds.
Introduction
Chickpea seeds are quite rich in protein and carbohydrates, thus largely consumed worldwide. Chickpea, as an edible legume, has a significant place in the daily diets of low-income countries and is largely grown in the Mediterranean countries, Asia, Africa, and Europe (Sastry et al. 2019). Chickpea seeds contain 29% protein, 59% carbohydrate, 5% oil, 4% ash, and 3% fiber (Varol et al. 2020; Iqbal et al. 2006). Chickpea seeds were reported to be effective in the prevention of various diseases such as obesity, colon cancer, diabetes, and cardiovascular diseases (de Camargo et al. 2019). Annual chickpea production was reported as 14,776,827 tons in the world and 470,000 tons in Turkey (FAO 2019).
Shape and size attributes of legumes offer quite reliable data for transportation, classification (Bacci et al. 2002), discrimination (Symons et al. 2003), packaging and storage systems (Öztürk and Kara 2009). Such attributes are also used in the design of planting systems of sowing machines (Sayinci et al. 2012) and in the design of harvesting systems. Color parameters play a great role in consumer demands and culinary preferences (Cetin et al. 2020). Chickpea seeds have great potential in the seed and food industry. However, for the economic and efficient use of seeds, knowledge is required about the physical properties of the seeds (Ropelewska et al. 2022). Based on seed color and geographical distribution, chickpeas are divided into two types: desi (India-originated) and kabuli (Mediterranean and Central Asia-originated). Kabuli chickpeas are large with white and cream seed coat colors (Chavan et al. 1987; Ozaktan 2021).
Both destructive methods (random amplification of polymorphic DNA (RAPD) analysis, multisensory gas analysis) and non-destructive methods (imaging, spectral reflection, vibration sensors, and excitation systems) are employed in the identification of different products. In destructive methods, samples pass through different processes in the laboratory for identification (Abbaspour-Gilandeh and Azizi 2018; Ropelewska and Jankowski 2019; Fathizadeh et al. 2021). Besides the complexity and time-consuming nature of destructive methods, other limitations arise, the most important of which is the possibility of damaging the sample (Abbaspour-Gilandeh and Azizi 2018). Therefore, image analysis operations have successfully been employed as an inexpensive, accurate, and reliable approach for the assessment of seeds of different species (Ropelewska and Jankowski 2019; Sabanci et al. 2021; Ropelewska 2020).
Physical measurements of seeds are time-consuming and laborious, and novel technologies have been developed to overcome these problems. With such technologies, products can be detected, sorted, and classified simply and quickly (Soares et al. 2013). Such practical approaches are also needed to define the characteristics used in the quality assessment of seeds (Omid et al. 2010). Image processing and machine learning methods are powerful tools for describing the characteristics of seeds. The shape and size attributes of the seed are revealed in a short time with the image processing technique (Çetin et al. 2020). Machine learning offers nonlinear models that can predict past and future values within an input–output link (Zhang et al. 2012). Several researchers previously investigated the shape, area, size, and color attributes of chickpea and legume seeds (Masoumi and Tabil 2003; Nikoobin et al. 2009; Simonyan et al. 2009; Eissa et al. 2010; Abou-Salem and Abou-Arab 2011; Güzel and Sayar 2012; Kibar et al. 2014; Jogihalli et al. 2017; Sastry et al. 2019; Cetin et al. 2022). In these studies, principal physical attributes such as length, thickness and width were generally measured with a caliper, and area, diameter, volume, and shape attributes were calculated from these measurements. Measuring relevant attributes with conventional methods significantly increases the time and labor required (Çetin et al. 2022). However, novel approaches such as texture image analysis and spectrophotometric analysis can be used for such purposes. In addition, discriminant analysis can be conducted with principal component analysis (PCA) and hierarchical cluster analysis (HCA). Instrumental and spectrophotometric analytical methods provide significant savings in the required labor and time.
PCA is used to determine the attributes of cultivars, evaluate correlations among the investigated traits, then identify superior cultivars for certain traits. HCA offers reliable discrimination and grouping of cultivars (Cetin et al. 2020).
Mass prediction of the seed from easily determined shape and size properties may support the design of sorting machines and classification processes (Gonzalez et al. 2017). Many researchers reported that machine learning yielded better performances for the prediction of agricultural product characteristics (Moosavi and Sepaskha 2012; Demir et al. 2017; Kus et al. 2017; Shabani et al. 2017; Çetin et al. 2021). There is a limited number of studies on seed mass prediction (Soares et al. 2013; Rad et al. 2015, 2017; Gurbuz et al. 2018; Demir et al. 2020; Sağlam and Çetin 2021). However, there are no studies on the mass prediction of chickpeas with the use of image processing and machine learning algorithms. This study was carried out to (I) determine the differences in seeds of 26 chickpea cultivars grown under the same ecology without using any chemical inputs, (II) detect shape and size attributes of chickpea seeds with the image analysis technique, (III) evaluate these data through hierarchical clustering analysis and principal component analysis and compare the relationships among the attributes, and (IV) compare the performance of different machine learning algorithms (MLP, RF, SVR, and k-NN) for mass prediction from physical attributes.
Material and method
Study area and sample preparation
In the present experiments, 26 kabuli chickpea cultivars (Akça, Akçin 91, Aksu, Aziziye 94, Azkan, Cevdetbey 98, Çağatay, Çakır, Damla 89, Dikbas, Er 99, Gökçe, Gülümser, Hasanbey, Ilgaz, Işık 05, İnci, İzmir 92, Menemen 92, Osmanbey, Sarı 98, Seçkin, Sezenbey, Uzunlu 99, Yaşa 05 and Zuhal) were used as the plant material. Osmanbey is a genotype rather than a standard cultivar.
Experiments were conducted on experimental fields of the Agricultural Research and Implementation Center (38°42′54.2"N–35°32′42.4"E) of Erciyes University in the years 2018 and 2019 in a randomized block design with 3 replications. Each plot had 6 rows 30 cm apart with 5 cm on-row plant spacing. Conventional agronomic practices were not conducted on experimental fields during the last 3 years. Following the manual sowing, sprinkler irrigation was practiced until the emergence and the first manual weed control, then drip irrigation was practiced through drip lines placed by each row. Climate parameters (monthly average temperature, °C; relative humidity, %; precipitation, mm) of the experimental years and long-term (1931–2019) averages. Soil samples were taken from 0–30 cm soil depth at different points to represent the entire study area. Air-dried samples were passed through 2 mm sieves and subjected to soil analyses.
Image texture analysis of chickpea seeds
In the present study, 100 seeds were used in one image. Three replications were performed for each cultivar. A total of 600 seeds were used for each cultivar, as 300 seeds (1 variety × 3 replications) in the year of 2018 and 300 seeds in the year of 2019. Seed mass (M, g) was determined with a precise balance (± 0.001 g). The image acquisition system was composed of a digital CCD (Charge-Coupled Device) camera, lens, macro flash, lighting equipment, and imaging platform. Images were taken in a dark environment to get shadow-free images (Sayinci et al. 2012; Kara et al. 2013).
Chickpea seeds were placed on gray cardboard paper in two different orientations (horizontal and vertical) with the use of double-sided adhesive tape. Images were taken from a vertically positioned camera at a fixed height of 50 cm. To provide dimensional calibration, the unit conversion ratio of the images was obtained in pixels/mm by using a ruler on a fiberglass plate. A representative scheme of the imaging setup is given in Fig. 1. Captured images were transferred to a computer and saved as *.tiff files for descriptive analysis. Length (L, mm), width (W, mm), thickness (T, mm), projected area (PA, mm2), equivalent diameter (ED, mm), perimeter (P, mm), shape factor (SF, mm) and compactness (C, %) were determined in both horizontal and vertical orientations with the use of SigmaScan®Pro 5.0 and MATLAB software. Volume (V, mm3), shape index (SI), roundness (R), geometric mean diameter (Dg, mm), surface area (S, mm2), elongation (E) and sphericity (φ, %) were calculated from these measurements with the equations given by Sayıncı et al. (2015) and Demir et al. (2018). Figure 2 shows an example image through the entire segmentation process. In the first step, the background of the original image was removed. In the second step, the gray-scale transformation was performed. In the third step, the gray-scale image was inverted. Finally, the remaining background pixels were removed with a morphology operator and objects smaller than 100 pixels were removed. This threshold was chosen by trial and error.
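The derived attributes can be illustrated with a short sketch. The equations of the study are not reproduced in this section, so the forms below are assumptions based on definitions that are common in the seed physical properties literature (geometric mean diameter as the cube root of L·W·T, sphericity as Dg/L, surface area as π·Dg², ellipsoid volume, elongation as L/W, and equivalent diameter as the diameter of a circle with the same projected area). All function names are hypothetical, and the input values in the usage lines are made up.

```python
import math


def pixels_to_mm(pixels, pixels_per_mm):
    """Convert a pixel length to mm with the calibration ratio (pixels/mm)."""
    return pixels / pixels_per_mm


def seed_geometry(length, width, thickness):
    """Derived size and shape attributes from three axial dimensions (mm).

    Assumed, commonly used definitions; not confirmed as the study's equations.
    """
    dg = (length * width * thickness) ** (1.0 / 3.0)  # geometric mean diameter, mm
    return {
        "Dg": dg,
        "sphericity_pct": 100.0 * dg / length,
        "surface_area": math.pi * dg ** 2,  # mm^2
        "volume": math.pi / 6.0 * length * width * thickness,  # ellipsoid, mm^3
        "elongation": length / width,
    }


def equivalent_diameter(projected_area):
    """Diameter (mm) of the circle whose area equals the projected area (mm^2)."""
    return math.sqrt(4.0 * projected_area / math.pi)


# Made-up example seed: 11.0 mm long, 9.0 mm wide, 9.0 mm thick.
for name, value in seed_geometry(11.0, 9.0, 9.0).items():
    print(name, round(value, 2))
print("ED", round(equivalent_diameter(75.0), 2))
```

With these assumed forms, a seed with equal length, width and thickness has a sphericity of 100% and an elongation of 1, which is the limiting case of a perfectly spherical seed.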
Chromatic characteristics
Color measurements of chickpea seeds of each cultivar were made with a chroma meter (Konica Minolta CR-400, Japan). Measurements were made in CIELab color space. L* (brightness, 0 dark, 100 light), a* (positive values red, negative values green), and b* (positive values yellow, negative values blue) values were measured. The color index (CI), hue angle (h*), and chroma (C*) values were calculated from the measured L*, a*, and b* values (McGuire 1992).
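As a minimal sketch of these color calculations: chroma and hue angle follow the standard CIELab definitions (C* as the length of the (a*, b*) vector, h* as its angle in degrees). The color index form used here, CI = 1000·a*/(L*·b*), is an assumption, since the study's own equation is not reproduced in this section. Function names are hypothetical and the sample values are made up.

```python
import math


def chroma(a, b):
    """Chroma C*: distance of (a*, b*) from the neutral axis."""
    return math.hypot(a, b)


def hue_angle_deg(a, b):
    """Hue angle h* in degrees, mapped to the range [0, 360)."""
    return math.degrees(math.atan2(b, a)) % 360.0


def color_index(lightness, a, b):
    """Color index, assumed form CI = 1000 * a* / (L* * b*)."""
    return 1000.0 * a / (lightness * b)


# Made-up reading in the range reported for the cultivars.
L_star, a_star, b_star = 70.0, 7.0, 21.0
print(round(chroma(a_star, b_star), 2))
print(round(hue_angle_deg(a_star, b_star), 2))
print(round(color_index(L_star, a_star, b_star), 2))
```

Using `atan2` rather than a plain arctangent of b*/a* keeps the hue angle in the correct quadrant when a* is negative or zero.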
Principal component and hierarchical cluster analysis
Principal component analysis (PCA) is generally used to identify superior cultivars for certain traits, to reduce large datasets, and to make some cultivars prominent for some traits. Cluster analysis allows the grouping of samples based on similarities between each other. PCA and hierarchical cluster analysis were conducted with the use of Past3 software. For hierarchical cluster analysis, Euclidean square distance, Ward’s method, and PCA correlation matrix were used (Demir et al. 2018).
Machine learning modeling
The prediction of seed mass was performed using WEKA machine learning software (Machine Learning Group, University of Waikato) (Witten and Frank 2005). The Correlation-based Feature Selection (CFS) subset evaluator and Best First algorithm were utilized to determine the attributes with the higher power to predict seed mass (Sağlam and Çetin 2021). The model development was conducted on various datasets (inputs) including shape and size attributes such as cultivar, SFh, PAh, EDh, Dg, φ, T, SFv, PAv, and EDv. A total of 3000 data points were used, 300 for each attribute. The models were built with two test modes, tenfold cross-validation and a train (70%)-test (30%) split, and with algorithms from the groups of Multilayer Perceptron (MLP), Random Forest (RF), Support Vector Regression (SVR) and k-Nearest Neighbors (k-NN). In the k-NN algorithm, the Euclidean distance rule was utilized in the searching process, and the k value was chosen as 1, 3, and 5 for mass prediction. The Pearson VII (PUK) kernel function was chosen for SVR. For seed mass prediction, a 10-10-1 ANN structure was used (10 inputs, 10 neurons in the hidden layer, and 1 output). In all MLP predictions, the number of epochs was 1000, the learning rate was 0.3, the momentum coefficient was 0.1, and the sigmoid was chosen as the activation function. The applied MLP model structure is provided in Fig. 3. Model performance was assessed with the following statistical metrics: correlation coefficient (R), mean absolute error (MAE), root mean square error (RMSE), relative absolute error (RAE) and root relative squared error (RRSE), which were calculated using Eqs. (7–11) (Parker 2001).
where n: Number of data, Mi: Measured target value, Ei: Estimated target value, Ṁ: Average of measured target values, Ė: Average of estimated target values, SE: Sum of estimated target values, SM: Sum of measured target values. Correlation coefficients (R) were analyzed to assess the success of estimations in accordance with the principles specified in Colton (1974). Correlation coefficients between 0 and 0.25 indicate no correlation or slight correlation, those between 0.25 and 0.50 indicate moderate correlation, those between 0.50 and 0.75 indicate moderate or high correlation, and those between 0.75 and 1 indicate perfect correlation.
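The evaluation procedure can be sketched in plain Python. This is not the WEKA implementation used in the study: it is a minimal illustration of k-NN regression with Euclidean distance, k-fold cross-validation, and the five metrics in their usual textbook forms, which are assumed here because Eqs. (7–11) are not reproduced in this section. The function names, the two input attributes, and the synthetic data are all hypothetical.

```python
import math
import random


def knn_predict(train_X, train_y, query, k):
    """Mean target of the k training rows nearest to query (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def k_fold_indices(n, folds, seed=0):
    """Shuffle the row indices and deal them into `folds` disjoint test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::folds] for f in range(folds)]


def cross_validated_predictions(X, y, k, folds=10, seed=0):
    """Predict every row from a model that never saw that row's fold."""
    predictions = [0.0] * len(X)
    for test_idx in k_fold_indices(len(X), folds, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in test_idx:
            predictions[i] = knn_predict(train_X, train_y, X[i], k)
    return predictions


def regression_metrics(measured, estimated):
    """R, MAE, RMSE, RAE (%) and RRSE (%) in their usual forms (assumed)."""
    n = len(measured)
    m_bar = sum(measured) / n
    e_bar = sum(estimated) / n
    abs_err = [abs(e - m) for m, e in zip(measured, estimated)]
    sq_err = [(e - m) ** 2 for m, e in zip(measured, estimated)]
    cov = sum((m - m_bar) * (e - e_bar) for m, e in zip(measured, estimated))
    var_m = sum((m - m_bar) ** 2 for m in measured)
    var_e = sum((e - e_bar) ** 2 for e in estimated)
    return {
        "R": cov / math.sqrt(var_m * var_e),
        "MAE": sum(abs_err) / n,
        "RMSE": math.sqrt(sum(sq_err) / n),
        "RAE_pct": 100.0 * sum(abs_err) / sum(abs(m - m_bar) for m in measured),
        "RRSE_pct": 100.0 * math.sqrt(sum(sq_err) / var_m),
    }


# Synthetic, made-up data: two size attributes (mm) and a noisy mass (g).
rng = random.Random(1)
X = [[rng.uniform(9.0, 10.0), rng.uniform(8.0, 9.5)] for _ in range(60)]
y = [0.03 * d + 0.02 * t + rng.gauss(0.0, 0.01) for d, t in X]
for k in (1, 3, 5):
    scores = regression_metrics(y, cross_validated_predictions(X, y, k))
    print(k, {name: round(value, 4) for name, value in scores.items()})
```

RAE and RRSE compare the model's errors with those of a baseline that always predicts the mean measured value, so values below 100% indicate that the model improves on that baseline. The sketch uses unscaled distances because both synthetic inputs are in mm; attributes on different scales would normally be normalized first.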
Statistical analysis
Physical and color parameters were analyzed with JMP 13.2.0 software, image analyses of physical properties were performed with SigmaScan®Pro 5.0 (2004) and MATLAB software, and hierarchical clustering and principal component analysis (PCA) were performed with PAST software. The mean values of each parameter were subjected to the Shapiro–Wilk (W) test. After the values were confirmed to be normally distributed, analysis of variance was performed, and significant means were compared using the Tukey test (Düzgüneş et al. 1987).
Results and discussion
Physical attributes of chickpea cultivars
Mean values and Tukey groups of variance analysis for physical attributes of chickpea cultivars are given in Table 1. The mean values of seeds for both years (2018 and 2019) are given in the tables. As can be inferred from Table 1, the effects of years (Y) on seed mass, length and perimeter were not found to be significant. However, the effects of cultivars (G) and YxG interactions on mass, volume, length, thickness, geometric mean diameter, surface area and equivalent diameter (vertical) were found to be significant at a 1% level.
The highest seed mass was obtained from Ilgaz cultivar (0.49 g) and the lowest from Menemen 92 and İzmir 92 cultivars (0.38 g). The highest volumes were obtained from Sarı 98, Ilgaz, Çakır, Cevdetbey and Aksu cultivars, respectively and the lowest from Menemen 92 cultivar. The greatest length was measured in Çakır cultivar (12.13 mm) and the lowest in Menemen 92 cultivar (10.13 mm). In terms of seed width, Cevdetbey, Sarı 98, Aksu, Ilgaz cultivars had the highest values, while İzmir 92 and Menemen 92 cultivars had the lowest values. The highest thickness was obtained from Aziziye 94 cultivar (9.29 mm). Geometric mean diameter values varied between 9.00 and 9.96 mm.
Nikoobin et al. (2009) reported seed lengths of chickpeas with different moisture content as between 8.46–9.71 mm. Sastry et al. (2019) reported length values (between 6.4–0.6 mm) lower than the present ones. Masoumi and Tabil (2003) reported the average seed width of large, small and desi types respectively as 8.35, 7.06 and 6.46 mm. Seed thickness and mass for kabuli type were reported as 8.25 mm and 0.507 g. Simonyan et al. (2009) reported seed widths of 'Rongai' cultivar as between 7.39–8.07 mm and thickness values as between 5.39–5.88 mm. Geometric mean diameter values of 'Rongai' cultivar varied between 7.17–7.90 mm. Çetin (2022) reported volumes of 5 different chickpea cultivars as between 76.38–232.77 mm3 and seed mass values as between 0.09–0.29 g. Present findings on seed volumes comply with the results of Masoumi and Tabil (2003), Simonyan et al. (2009) and Sastry et al. (2019).
The greatest projected area (vertical orientation) was obtained from Çakır (82.60 mm2) cultivar and the lowest from İzmir 92 (65.42 mm2) cultivar. The projected area (horizontal orientation) values changed between 63.46 and 81.44 mm2 and differences between the years were found to be significant (p < 0.05) and the average value was identified as 76.12 mm2 in the first year and 71.35 mm2 in the second year. The highest surface area (313.21 mm2) was obtained from Ilgaz cultivar and the cultivars Sarı 98 (312.32 mm2), Çakır (312.04 mm2), Cevdetbey 98 (309.83 mm2) and Aksu (306.26 mm2) were also placed into the highest group and the lowest value was obtained from Menemen 92 (255.82 mm2) cultivar. In vertical orientation, the greatest perimeter was obtained from Çakır (44.74 mm) cultivar and the lowest from Menemen 92 (33.48 mm) and İzmir 92 (33.86 mm) cultivars. While Seçkin (41.58 mm) cultivar had the greatest perimeter in horizontal orientation, the lowest was obtained from Menemen 92 (32.57 mm) cultivar. For perimeter, differences between the years were not found to be significant. Among the chickpea cultivars, Çakır (10.22 mm) cultivar had the highest equivalent diameter in vertical orientation and Zuhal (10.62 mm) cultivar had the greatest equivalent diameter in horizontal orientation.
Masoumi and Tabil (2003) reported the average projected area of large-type chickpeas as 66.09 mm2. The surface area is closely related to evaporation from the seed surface. George et al. (2007) stated that drying rates increased with the increasing surface area of the seed. Similar to the present findings, Kibar et al. (2014) reported the average surface area of bean seeds as 200.62 mm2 in non-irrigated areas and 239.99 mm2 in drip-irrigated areas. Çetin (2022) reported the projected area and surface area of 5 soybean cultivars respectively as between 21.76–45.75 mm2 and between 87.06–182.99 mm2.
The variance analysis results, mean values and Tukey groups for shape and color attributes of chickpea cultivars are provided in Table 2. Effects of YxG (1%) interactions on compactness (Ch), shape factor (SFv and SFh), a* and b* parameters were found to be significant. Effects of years on sphericity, shape index, elongation (Ev) and b* were also found to be significant at 1% level. Compactness (Cv) values varied between 16.56–29.12 and all cultivars were placed into the same statistical group. The greatest sphericity value (89.04%) was obtained from Menemen 92 cultivar and the lowest from Çakır (82.48%) cultivar. Among the chickpea cultivars, Çakır (1.35) had the highest shape index value. For roundness (in vertical orientation), Aziziye cultivar (0.84) was prominent, while Azkan cultivar (0.69) had the lowest value.
The lowest roundness (horizontal orientation) value (0.66) was obtained from Zuhal cultivar, while the highest value (0.79) was obtained from Damla cultivar. A roundness value of close to “1” indicates a circular seed shape (Çetin et al. 2020; Çetin 2022). Consistent with present findings, Simonyan et al. (2009) reported the equivalent diameter values of 'Rongai' chickpea cultivar at 9.7, 21.1, 23.9 and 29% moisture contents respectively as 7.26, 7.77, 7.83 and 8.01 mm. Sastry et al. (2019) reported mean sphericity values of desi, kabuli and intermediate-type chickpea seeds as 79.5, 85.7 and 84.5%, respectively. Those values were higher than the present ones. Nikoobin et al. (2009) indicated that the sphericity values of chickpea seeds with different moisture content varied between 86.69–88.50% for 'Arman' cultivar and between 86.30–87.65% for 'Hashem' cultivar. Eissa et al. (2010) reported the sphericity value of chickpea seeds as 85.53% for 'Giza 3' cultivar and 87.00% for 'Giza 195' cultivar. Simonyan et al. (2009) reported sphericity values of 'Rongai' cultivar at different moisture contents as between 77.4–76%. Çetin (2022) reported average sphericity, shape index and roundness as 88.47%, 1.20 and 0.78, respectively. The mean aspect ratio and elongation of soybean seeds were also reported as 0.78 and 1.30, respectively.
While Azkan (1.35) cultivar had the highest elongation in the vertical orientation, the lowest average was obtained from Aziziye 94 (1.18) cultivar. The highest average elongation in horizontal orientation was obtained from Çakır cultivar (1.35) and the lowest values were obtained from Damla 89 (1.19), Menemen 92 (1.20) and Çağatay (1.21) cultivars. When classifying seeds based on shape ratio, the ones with a shape ratio of less than 1.25 are classified as round and the ones with a ratio of greater than 1.25 are classified as oval (Çetin et al. 2020; Çetin et al. 2021). In the present study, chickpea cultivars were all classified as “oval” in shape. It was determined that the roundness and sphericity values decreased.
Similar to the current study, Nikoobin et al. (2009) reported the elongation value in horizontal orientation as 1.21 for 'Arman' cultivar and 1.22 for 'Hashem' cultivar. Elongation value in horizontal orientation was reported as 1.27 for both 'Arman' and 'Hashem' cultivars.
The highest L* value was obtained from Sezenbey (72.80) cultivar, and the lowest value was obtained from Cevdetbey (67.59) cultivar. The a* values varied between 6.23–8.18. The highest b* value was obtained from İzmir 92 (23.75) cultivar and the lowest values from Ilgaz (19.64) and Aksu (19.65) cultivars.
Jogihalli et al. (2017) reported L* and b* values of chickpea seeds respectively as 88.48 and 22.43, which were higher than the present values. Abou-Salem and Abou-Arab (2011) reported L* and b* values for chickpea seeds respectively as 82.22 and 19.69 for 'Giza' cultivar. Complying with the present findings, Güzel and Sayar (2012) reported L*, a*, and b* values of chickpea seeds respectively as 60.75, 6.77, and 16.14.
Correlations among physical properties of the cultivars
The correlation matrix between the physical properties of the chickpea cultivars is tabulated in Table 3. Seed compactness (in vertical orientation) had a negative correlation with shape factor (in vertical orientation) (r = −0.93). The projected area in vertical orientation had a positive correlation with equivalent diameter in vertical orientation (r = 0.90), roundness in the vertical orientation (r = 0.90), volume (r = 0.95) and compactness in the horizontal orientation (r = 0.95). Compactness in horizontal orientation also had a positive correlation with equivalent diameter in the vertical orientation (r = 0.99) and elongation in vertical orientation (r = 0.99). Besides, there was a positive correlation between shape index and elongation in horizontal orientation (r = 0.95). In addition, a* had a high correlation with b* (r = 0.97).
Similar to the present findings, Sastry et al. (2019) reported positive correlations of 100-seed weight with seed length (r = 0.901), width (r = 0.959) and thickness (r = 0.961); positive correlations of seed width with sphericity (r = 0.740) and shape index (r = 0.727) and positive correlations between sphericity and shape index (r = 0.958).
Hierarchical cluster analysis (HCA)
Hierarchical cluster analysis was used to cluster the present cultivars based on physical properties, and a dendrogram was generated (Fig. 4). The dendrogram was generated using the Euclidean similarity index with the unweighted pair group method with arithmetic mean (UPGMA) algorithm. Correlation coefficient (r) values for the first, second and combined years were calculated as 0.6858, 0.7280 and 0.8376, respectively. Three sub-clusters were formed under two main clusters in the dendrogram. There were two cultivars (Menemen and İzmir) in the sub-cluster of the first main cluster. The second main cluster was divided into two sub-clusters. While two sub-clusters were divided into two sub-groups, the second sub-group included Cevdetbey, Sarı, Ilgaz and Çakır cultivars, which were similar in terms of physical properties. Aksu cultivar was separated from the other second sub-group and took place in the first group of the second sub-cluster of the third main cluster alone. In addition, Akça, Osmanbey and Gökçe cultivars were similar to each other and formed the first group of the second sub-cluster of the second main cluster. According to Euclidean distances, Akça – İnci and Damla – Işık cultivars were the closest cultivars in terms of physical characteristics.
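How the closest pair of cultivars follows from Euclidean distances can be shown with a minimal sketch. It assumes that each attribute is standardized (z-scores) before distances are computed, which may differ from the settings used in Past3. The cultivar labels and attribute values below are placeholders, not data from the study.

```python
import math
from itertools import combinations


def standardize(rows):
    """Z-score each column so that no attribute dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def closest_pair(names, rows):
    """Return the two names whose standardized rows are nearest to each other."""
    z = standardize(rows)
    i, j = min(
        combinations(range(len(names)), 2),
        key=lambda pair: math.dist(z[pair[0]], z[pair[1]]),
    )
    return names[i], names[j]


# Placeholder cultivars with made-up length (mm), thickness (mm) and mass (g).
names = ["A", "B", "C", "D"]
rows = [
    [11.0, 9.0, 0.44],
    [11.1, 9.1, 0.45],
    [12.1, 8.6, 0.49],
    [10.1, 8.9, 0.38],
]
print(closest_pair(names, rows))
```

The sketch requires every attribute to vary across cultivars, because a constant column has zero standard deviation and cannot be standardized. Agglomerative methods such as UPGMA or Ward's method start from this same distance matrix and merge the closest pair first.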
Sastry et al. (2019) stated that three main groups were formed in hierarchical cluster analysis for the physical and hydration properties of chickpea seeds. The first group consisted of 26 cultivars largely represented by the cultivars, except for one desi cultivar; the second group consisted of 57 cultivars represented by desi type (30), kabuli type (19) and intermediate type (8) cultivars. The authors found that the third group, except for an intermediate cultivar, consisted of the desi type (128) cultivars. Cultivars in the first group were prominent for seven traits (100 seed weight, seed volume, surface area, sphericity, seed shape direction, hydration capacity, swelling capacity), cultivars in the second group for four traits (seed moisture content, seed bulk density, porosity, swelling index) and cultivars in the third group for three traits (seed coat content, seed true density, seed hydration index).
Guidoti et al. (2018) performed hierarchical clustering analysis (HCA) for morphological, agronomic, and molecular characteristics of commonly grown bean cultivars and indicated that 17 cultivars were clustered into two large groups with a difference value of 0.81. The first group was composed of 35.3% of the examined cultivars and the second group was composed of 64.7% of the cultivars.
Principal component analysis (PCA)
In the principal components analysis, the first two principal components (PC1: 50.37 and PC2: 22.80) accounted for 73.17% of total variation in physical properties. The first principal component included length, geometric mean diameter, volume, and surface area traits and the second principal component included roundness (in the horizontal orientation), thickness, elongation (in horizontal orientation) and sphericity traits.
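The share of variation attributed to each principal component is its eigenvalue divided by the sum of all eigenvalues, and the cumulative share is the running sum. The sketch below illustrates this with hypothetical eigenvalues (the study's eigenvalues are not given in this section) and then checks the reported total from the two reported percentages.

```python
def explained_variance_pct(eigenvalues):
    """Percentage of total variation carried by each principal component."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]


def cumulative(values):
    """Running sum, used here for the cumulative percentage explained."""
    out, running = [], 0.0
    for v in values:
        running += v
        out.append(running)
    return out


# Hypothetical eigenvalues of a correlation matrix with four traits.
shares = explained_variance_pct([2.0, 1.0, 0.6, 0.4])
print([round(s, 2) for s in shares])
print([round(c, 2) for c in cumulative(shares)])

# Reported shares for PC1 and PC2 in this study.
print(round(50.37 + 22.80, 2))  # 73.17
```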
A positive correlation was identified among geometric mean diameter, surface area and volume. In addition, a close relationship was found between the projected area (in the horizontal orientation) and the equivalent diameter (horizontal orientation). Ilgaz, Sarı and Çakır cultivars were determined to be ahead of the other cultivars in terms of geometric mean diameter, volume and projected area. Aksu and Cevdetbey were found to be prominent for geometric mean diameter and surface area (Fig. 5).
Sastry et al. (2019) conducted a principal component analysis of physical and hydration properties of chickpea seeds. The first PC (PC 1) explained 59.3% of the total variation, followed by PC 2 (14.8%) and PC 3 (7.0%) (three principal components together explained 81.05% of total variation). PC1 included seed weight, geometric mean diameter, hydration capacity, seed length, width, thickness, volume, sphericity, surface area, and seed swelling capacity, PC2 included seed bulk density, seed porosity, and swelling index and PC3 included only seed true density. It was determined that only two of 15 traits in the first three PCs (seed coat content in PC 1 and seed true density in PC 3) had a negative effect on variation.
Kibar et al. (2014) evaluated the factor coefficients for quality traits of dry beans grown under non-irrigated and drip-irrigated conditions with the use of PCA and indicated that the first three principal components explained 70% of the total variation under non-irrigated conditions and two principal components (PC1 and PC2) accounted for 100% of the total variation in physical and mechanical properties under drip-irrigated conditions. While the first principal component (PC1) explained 70.16% of the total variation, the second principal component (PC2) explained 29.84% of the total variation.
Prediction results of machine learning algorithms
Machine learning models were built with data from 26 chickpea cultivars. The results were assessed with the statistical parameters of the seed mass prediction. Performance results of prediction using MLP, RF, SVR, and k-NN are presented in Table 4. Higher R and lower MAE, RMSE, RAE, and RRSE values indicate more successful prediction.
The highest correlation coefficient values were found in RF and MLP with the values of 0.8054 and 0.8043 for train-test split, and 0.8231 and 0.8142 for k-fold cross validation, respectively. RMSE is generally used to compare algorithms because it is mathematically convenient. The lowest RMSE was also observed in RF as compared to the other algorithms for both validation methodologies. The lowest MAE, RMSE, RAE, and RRSE values for the train-test split method were obtained from RF as 0.0276, 0.0414, 51.01%, and 59.27%, respectively. For k-fold cross validation, RF also had the lowest MAE (0.0272), RMSE (0.0382), RAE (51.22%) and RRSE (56.78%) values. In this study, the lowest R values were found in the 1-NN algorithm as 0.7271 and 0.7059 for train-test and k-fold methods, respectively. The greatest MAE, RMSE, RAE and RRSE values were determined in the 1-NN algorithm for both validation methodologies. Among the k-NN algorithms, the most successful “k” value was 5. The 5-NN algorithm had the greatest R with the values of 0.7974 and 0.8065, respectively. The 3-NN yielded R values similar to those of 5-NN for both validation methodologies.
Similar to the present study, Soares et al. (2013) reported the R2 values as between 0.69 and 0.91 for mass prediction of banana bunch using six different ANN structures. Rad et al. (2017) obtained R values of four different ANN structures as between 0.67 and 0.93 for eggplant mass prediction. Saglam and Cetin (2022) used MLP, kNN, RF, and GP to predict pistachio mass and stated Gaussian Processes (GP) had the lowest RMSE (0.038 for nut and 0.029 for kernel mass prediction) and the greatest R (0.976 for nut mass and 0.948 for kernel mass prediction). Gurbuz et al. (2018) obtained the greatest correlation coefficient with the value of 0.8603 for almond mass prediction by different data mining algorithms. Demir et al. (2020) used Modular Neural Network (MNN) and Radial Basis Neural Network (RBNN) structures to predict walnut mass from the physical attributes and reported similar RMSE values between 0.60 and 0.89 for MNN and as 0.0002 for RBNN. In contrast, Rad et al. (2015) indicated predicted mass of melon by MLP algorithms and R2 was lower than the present value. Overall, each base learner performed all statistical parameters well with all achieving an R value of > 0.7059. These results showed that all models had sufficient success in the mass prediction of seed.
The study had two limitations. First, the large number of images placed a heavy load on the computer's processor during image processing and machine learning, which extended the processing time. Second, a dark background was not used when the images were taken, which made image segmentation difficult. Researchers are advised to pay attention to these two factors in future studies.
Conclusion
In this study, the physical properties of 26 different chickpea cultivars grown for two years under the same ecological conditions without any chemical inputs were determined by image processing, and the relationships between these traits were evaluated. Highly positive correlations were seen between projected area (in vertical orientation) and equivalent diameter (in vertical orientation), roundness (in vertical orientation), volume, and compactness (in horizontal orientation). There was also a high correlation between a* and b*. In the PCA analysis, PC1 and PC2 explained 73.17% of the total variation. The first principal component included length, geometric mean diameter, volume, and surface area; the second principal component included roundness (in horizontal orientation), thickness, elongation (in horizontal orientation), and sphericity. According to Euclidean distances, Akça–İnci and Damla–Işık were the closest cultivars in terms of physical characteristics. Random Forest and Multilayer Perceptron yielded better outcomes than the other machine learning algorithms. The present findings may contribute significant data to quality grading, selection, and breeding studies, and to the food industry. Computer and machine vision techniques offered an accurate and rapid classification of chickpea cultivars and may provide effective and practical tools for agricultural machinery designers and seed companies.
In recent years, technological developments in the seed industry have significantly improved the quality and functional requirements of seeds. This study presents important knowledge for the seed industry. In future studies, the use of different algorithms, attribute selection methods, and species will contribute to the literature and to the optimization of such systems. It is also thought that using deep learning techniques together with machine learning in these systems will increase efficiency.
References
Abbaspour-Gilandeh Y, Azizi A (2018) Identification of cracks in eggs shell using computer vision and Hough transform. Yuzuncu Yıl Univ J Agric Sci 28:375–383
Abu-Salem FM, Abou-Arab EA (2011) Physico-chemical properties of tempeh produced from chickpea seeds. J Am Sci 7:107–118
Bacci L, Rapi B, Novaro P (2002) Durum wheat quality evaluation software. In World Congress of Computers in Agriculture and Natural Resources, Proceedings of the 2002 Conference (p. 49). American Society of Agricultural and Biological Engineers.
Cetin N, Yaman M, Karaman K, Demir B (2020) Determination of some physicomechanical and biochemical parameters of hazelnut (Corylus avellana L) cultivars. Turkish J Agric Forest 44(3):1–12
Chavan JK, Kadam SS, Salunkhe DK, Beuchat LR (1987) Biochemistry and technology of chickpea (Cicer arietinum L) seeds. Crit Rev Food Sci Nutr 25(2):107–158
Colton T (1974) Statistics in medicine. Little Brown and Company
Çetin N, Karaman K, Beyzi E, Sağlam C, Demirel B (2021) Comparative evaluation of some quality characteristics of sunflower oilseeds (Helianthus annuus L) through machine learning classifiers. Food Anal Methods 14:1666–1681
Çetin N (2022) Machine learning for varietal binary classification of soybean (Glycine max (L) Merrill) seeds based on shape and size attributes. Food Anal Methods 15:2260–2273
de Camargo AC, Favero BT, Morzelle MC, Franchin M, Alvarez-Parrilla E, de la Rosa LA (2019) Is chickpea a potential substitute for soybean? Phenolic bioactives and potential health benefits. Int J Molec Sci 20:26–44
Demir B, Eski İ, Kuş ZA, Ercişli S (2017) Prediction of physical parameters of pumpkin seeds using neural network. Notulae Botan Horti Agrobot Cluj-Napoca 45(1):22–27
Demir B, Sayıncı B, Çetin N, Yaman M, Çömlek R, Aydın Y, Sütyemez M (2018) Elliptic Fourier based analysis and multivariate approaches for size and shape distinctions of walnut (Juglans regia L) cultivars. Grasas Aceites 69(4):e271
Demir B, Eski İ, Gürbüz F, Kuş ZA, Sesli Y, Ercişli S (2020) Prediction of walnut mass based on physical attributes by Artificial Neural Network (ANN). Erwerbs-Obstbau 62(1):47–56
Düzgüneş O, Kesici T, Kavuncu O, Gürbüz F (1987) Araştırma ve Deneme Metodları (İstatistik Metodları II). Ankara Üniv Ziraat Fak Yayınları: 1021 Ders Kitabı, 295 Ankara (In Turkish)
Eissa AHA, Mohamed MA, Moustafa H, Alghannam ARO (2010) Moisture dependent physical and mechanical properties of chickpea seeds. Int J Agric Biol Eng 3:80–93
Fathizadeh Z, Aboonajmi M, Hassan-Beygi SR (2021) Nondestructive methods for determining the firmness of apple fruit flesh. Inf Processing Agric 8:515–527
Food and Agriculture Organization of the United Nations (FAO) (2019) http://www.fao.org/faostat/en/#data/QC Accessed date: 06.01.2022
George C, McGruder R, Torgerson K (2007) Determination of optimal surface area to volume ratio for thin-layer drying of breadfruit (Artocarpus altilis). Int J Serv Learn Eng 2:76–88
Gonzalez JPB, Ortiz F, A P Rodriguez CPP, (2017) Mass and volume estimation of passion fruit using digital images. IEEE Latin America Trans 15(2):275–282
Guidoti DT, Gonela A, Vidigal MCG, Conrado TV, Romani I (2018) Interrelationship between morphological agronomic and molecular characteristics in the analysis of common bean genetic diversity. Acta Scient Agron 40
Gürbüz F, Demir B, Eski I, Kuş ZA, Yılmaz KU, İlikçioğlu E, Ercişli S (2018) Estimation of the weights of almond nuts based on physical properties through data mining. Notulae Botan Horti Agrobot Cluj-Napoca 46(2):579–584
Güzel D, Sayar S (2012) Effect of cooking methods on selected physicochemical and nutritional properties of barlotto bean chickpea faba bean and white kidney bean. J Food Sci Tech 49:89–95
Iqbal A, Ateeq N, Khalil IA, Perveen S, Saleemullah S (2006) Physicochemical characteristics and amino acid profile of chickpea cultivars grown in Pakistan. J Foodserv 17:94–101
Jogihalli P, Singh L, Kumar K, Sharanagat VS (2017) Physico-functional and antioxidant properties of sand-roasted chickpea (Cicer arietinum). Food Chem 237:1124–1132
Kara M, Sayinci B, Elkoca E, Öztürk İ, Özmen TB (2013) Seed size and shape analysis of registered common bean (Phaseolus vulgaris L) cultivars in Turkey using digital photography. J Agric Sci 19:219–234
Kibar H, Öztürk T, Temizel KE (2014) Effective engineering properties in the design of storage structures of postharvest dry bean grain. Acta Scient Agron 36:147–158
Kus ZA, Demir B, Eski I, Gurbuz F, Ercisli S (2017) Estimation of the colour properties of apples varieties using neural network. Erwerbs-Obstbau 59(4):291–299
Masoumi AA, Tabil L (2003) Physical properties of chickpea (C arietinum) cultivars. In 2003 ASAE Annual Meeting (p 1) American Soc Agric Biol Eng 1–14
McGuire RG (1992) Reporting of objective colour measurements. HortScience 27(12):1254–1255
Nikoobin M, Mirdavardoost F, Kashaninejad M, Soltani A (2009) Moisture-dependent physical properties of chickpea seeds. J Food Proc Eng 32:544–564
Omid M, Khojastehnazhand M, Tabatabaeefar A (2010) Estimating volume and mass of citrus fruits by image processing technique. J Food Eng 100(2):315–321
Ozturk I, Kara M (2009) Physico-chemical grain properties of new common bean cv 'Elkoca-05'. Sci Res Essays 4:88–93
Parker JR (2001) Rank and response combination from confusion matrix data. Inform Fus 2(2):113–120
Rad MRN, Fanaei HR, Rad MRP (2015) Application of Artificial Neural Networks to predict the final fruit weight and random forest to select important variables in native population of melon (Cucumis melo L). Sci Hortic 181:108–112
Moosavi AA, Sepaskhah A (2012) Artificial neural networks for predicting unsaturated soil hydraulic characteristics at different applied tensions. Arch Agron Soil Sci 58(2):125–153
Naroui Rad MR, Ghalandarzehi A, Koohpaygani JA (2017) Predicting eggplant individual fruit weight using an artificial neural network. Int J Veg Sci 23(4):331–339
Ozaktan H (2021) Technological characteristics of chickpea (Cicer arietinum L.) cultivars grown under natural conditions. Turkish J Field Crops 26(2):235–243
Ropelewska E, Jankowski KJ (2019) Classification of the seeds of traditional and double-low cultivars of white mustard based on texture features. J Food Process Eng 42:13077
Ropelewska E (2020) The use of seed texture features for discriminating different cultivars of stored apples. J Stored Prod Res 88:101668
Ropelewska E, Sabanci K, Aslan MF, Azizi A (2022) A novel approach to the authentication of apricot seed cultivars using innovative models based on image texture parameters. Horticulturae 8:431
Sabanci K, Aslan MF, Ropelewska E, Unlersen MF (2021) A convolutional neural network-based comparative study for pepper seed classification: analysis of selected deep features with support vector machine. J Food Process Eng 25:13955
Saglam C, Cetin N (2022) Prediction of pistachio (Pistacia vera L) mass based on shape and size attributes by using machine learning algorithms. Food Anal Methods 15(3):739–750
Sastry DVSSR, Upadhyaya HD, Srinivas TR (2019) Variation for seed physical and hydration properties of chickpea (Cicer arietinum L) mini core collection and their relevance to conservation and utilization. Plant Gen Resour 17:311–324
Sayinci B, Ercisli S, Ozturk I, Eryilmaz Z, Demir B (2012) Determination of size and shape in the ‘Moro’ blood orange and ‘Valencia’ sweet orange cultivar and its mutants using image processing. Notulae Botan Horti Agrobot Cluj-Napoca 40:234–242
Sayıncı B, Kara M, Ercişli S, Duyar Ö, Ertürk Y (2015) Elliptic Fourier analysis for shape distinction of Turkish hazelnut cultivars. Erwerbs-Obstbau 57:1–11
Shabani A, Ghaffary KA, Sepaskhah AR, Kamgar-Haghighi AA (2017) Using the artificial neural network to estimate leaf area. Sci Hortic 216:103–110
Simonyan KJ, Yiljep YD, Oyatoyan OB, Bawa GS (2009) Effects of moisture content on some physical properties of Lablab purpureus (L) sweet seeds. Agric Eng Int CIGR Journal 11:1–13
Soares JDR, Pasqual M, Lacerda WS, Silva SO, Donato SLR (2013) Utilization of artificial neural networks in the prediction of the bunches’ weight in banana plants. Sci Hortic 155:24–29
Symons SJ, Van Schepdael L, Dexter JE (2003) Measurement of hard vitreous kernels in Durum wheat by machine vision. Cereal Chem 80:511–517
Varol IS, Yusuf Murat K, Irik HA, Kirnak H, Kaplan M (2020) Supplementary irrigations at different physiological growth stages of chickpea (Cicer arietinum L.) change grain nutritional composition. Food Chem 303:125402
Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques. Morgan Kaufmann Press, San Francisco
Funding
This study was supported by Turkish Scientific Research Council (TUBITAK) with the project number of 119O226.
Contributions
All the authors have accepted responsibility for the content of the entire manuscript.
Ethics declarations
Conflict of interest
The authors are unaware of any conflict of interest or possible competition.
Consent to participate
Attached emails.
Consent for publication
Attached emails.
About this article
Cite this article
Çetin, N., Ozaktan, H., Uzun, S. et al. Machine learning based mass prediction and discrimination of chickpea (Cicer arietinum L.) cultivars. Euphytica 219, 20 (2023). https://doi.org/10.1007/s10681-022-03150-5
DOI: https://doi.org/10.1007/s10681-022-03150-5