Introduction

Coffee is one of the most consumed beverages in the world, with wide popularity in Europe and the USA. Its consumption is stimulated by its pleasant flavor and aroma, as well as by its physiological effects. The beverage has a high content of antioxidants, such as polyphenols, which are influenced by several factors, such as plant species and degree of grain roasting (Vignoli et al. 2011, 2014; Heeger et al. 2017; Araújo et al. 2020).

The fruit of the coffee tree consists of an outermost region, called the pericarp, which changes color during ripening. Below the pericarp is the pulp, which is followed by a thin, viscous layer known as mucilage. After the mucilage comes the endocarp (parchment), and finally there is the silver skin that covers the entire coffee bean, which is also called the endosperm (Esquivel and Jiménez 2012). Among the various elements that make up coffee beans are minerals. These are inorganic compounds whose content is highly correlated with the type of soil and the forms of coffee cultivation (Semen et al. 2017) and which perform essential functions in maintaining the physiological processes of the plant (Poltronieri et al. 2011).

Electrical conductivity and potassium leaching tests are closely related to the integrity of cell membranes (Vanzolini and Nakagawa 2003). Thus, the deterioration of grain membranes, such as those of coffee, can mark the beginning of the degradation of product quality (Oliveira et al. 2015a). According to Marcos Filho (2015), the electrical conductivity test is based on the fact that deteriorated membranes are repaired more slowly when the seeds are exposed to an aqueous medium, which causes the loss of compounds to the medium. Among these compounds are ions such as potassium, calcium, magnesium, sodium, and manganese. Whereas the electrical conductivity test quantifies the total ions in solution, the potassium leaching test measures only potassium.

Some studies have linked the integrity of the membrane to the quality of the coffee. Malta et al. (2005) reported higher values of electrical conductivity and potassium leaching for smaller coffee beans, as well as for defective bean classes, especially black and sour beans. Goulart et al. (2007) found that coffees classified as lower-quality beverages showed greater degradation of the cell wall and, consequently, higher electrical conductivity and potassium leaching values than better-quality beverages.

NIR reflectance spectroscopy can be used to determine the concentrations of macro- and microelements in different matrices, whether they are of food origin or not. However, the determination of compounds of inorganic origin by means of spectroscopic techniques is difficult, as there is no direct relationship between these compounds and the spectra of the samples (Cozzolino et al. 2008). Even so, several studies have applied NIR to evaluate these elements and found promising results, as reported by Menesatti et al. (2010) when analyzing the content of several minerals in orange leaves and Wu et al. (2009) who evaluated iron and zinc in powdered milk.

In view of the lack of studies that evaluate the physiological quality of coffee beans by means of near-infrared spectroscopy, this study sought to correlate spectral data from coffee seeds with the electrical conductivity and potassium leaching values obtained by standardized methods and evaluate the applicability of this technique to carry out these determinations.

Material and Methods

Acquisition of Samples and Recording of NIR Spectra

Two hundred and fifty samples of coffee (Coffea arabica L.) from different varieties were considered in the present study. The samples were taken from the 2016/2017 harvest season in different cities in the state of Espírito Santo (Fig. 1), located in the Southeast region of Brazil. Each sample contained about 500 g of green coffee and was provided by Instituto Capixaba de Pesquisa, Assistência Técnica e Extensão Rural (INCAPER).

Fig. 1 Cities where coffee samples were collected

All two hundred and fifty samples of green coffee were analyzed by Fourier transform NIR spectroscopy in an AgriQuant spectrophotometer (Q-Interline A/S, Stengårdsvej, Denmark) in reflection mode, with an internal background as reference. The samples were homogenized manually before the collection of subsamples for the replicates, which were measured consecutively. Spectra were collected from whole grains, without any initial pretreatment, and the samples were kept in constant rotation during collection. Data were recorded in triplicate between 4000 and 10000 cm−1, at intervals of 8 cm−1 and with 64 scans, in a climate-controlled room (18 °C, 65% RH), and the mean spectrum was then calculated. The instrument was configured with InfraQuant 3.0 software (Q-Interline, Stengårdsvej, Denmark).
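The averaging of the triplicate spectra can be illustrated with a minimal sketch. The function name `mean_spectrum` and the absorbance values are hypothetical and serve only as an example; the acquisition software may perform this step differently.

```python
def mean_spectrum(replicates):
    """Average replicate spectra point by point.

    `replicates` is a list of equally long absorbance lists, one per
    replicate measurement (three per sample in the protocol above).
    """
    if not replicates:
        raise ValueError("at least one replicate is required")
    n_points = len(replicates[0])
    if any(len(r) != n_points for r in replicates):
        raise ValueError("replicates must share the same wavenumber grid")
    return [sum(values) / len(replicates) for values in zip(*replicates)]


# Hypothetical absorbances of three replicates at four wavenumbers.
scans = [
    [0.50, 0.62, 0.71, 0.55],
    [0.52, 0.60, 0.73, 0.57],
    [0.51, 0.61, 0.72, 0.56],
]
average = mean_spectrum(scans)
```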

Determination of Electrical Conductivity and Potassium Leaching

Electrical conductivity was determined with an adaptation of the methodology proposed by Krzyzanowski et al. (1991). Fifty coffee beans were weighed on an analytical balance and then immersed in 75 mL of deionized water for 5 hours. After this time, the solutions were transferred to another container and the electrical conductivity was read with a conductivity meter (Digimed DM-32). The electrical conductivity value was then expressed in microsiemens per centimeter per gram.
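The expression of the result per gram can be sketched as follows. This assumes that the meter reading is simply divided by the mass of the fifty beans; the text does not state whether a blank reading of the deionized water was subtracted. The function name and the numerical values are hypothetical.

```python
def conductivity_per_gram(reading_us_cm, sample_mass_g):
    """Return electrical conductivity in uS/cm/g.

    Assumption: the value is the conductivity meter reading (uS/cm)
    divided by the mass of the soaked beans (g).
    """
    if sample_mass_g <= 0:
        raise ValueError("sample mass must be positive")
    return reading_us_cm / sample_mass_g


# Hypothetical example: a reading of 1200 uS/cm for 50 beans weighing 8.0 g.
ec = conductivity_per_gram(1200.0, 8.0)
```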

Potassium leaching was determined from the same solution used to determine the electrical conductivity values. In this case, the reading was performed on a flame photometer (Digimed DM-62) (Prete 1992). The final value of leached potassium was given by Eq. (1).

$$ \mathrm{LK}=\frac{\mathrm{R}\times \mathrm{D}\times 1.56}{\mathrm{W}} \tag{1} $$

where LK is leached potassium; R is the value expressed by the flame photometer; D is the sample dilution; and W is the weight of the sample (g).
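As a worked illustration of Eq. (1), the sketch below evaluates the expression for one sample. The function name and the input values are hypothetical; only the constant 1.56 and the form of the equation come from the text.

```python
POTASSIUM_FACTOR = 1.56  # constant taken from Eq. (1)


def leached_potassium(reading, dilution, sample_mass_g):
    """Evaluate Eq. (1): LK = R * D * 1.56 / W."""
    if sample_mass_g <= 0:
        raise ValueError("sample mass must be positive")
    return reading * dilution * POTASSIUM_FACTOR / sample_mass_g


# Hypothetical example: flame photometer reading R = 250, no dilution
# (D = 1), and a sample mass W = 8.0 g.
lk = leached_potassium(250.0, 1.0, 8.0)
```

With these illustrative inputs, LK is approximately 48.75, which lies inside the range of values reported for the samples in this study.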

All analyses were performed in triplicate.

Elaboration of PLS Models

In partial least squares (PLS) regression, a dependent variable is described as a function of a few new variables, called latent variables. These are created from the original data through linear combinations that capture the maximum covariance between the independent (absorbance) and dependent (chemical analysis) variables (Goodarzi et al. 2015). For the PLS regression, the spectroscopic data and the property of interest (potassium leaching or electrical conductivity) were used simultaneously. Initially, the total sample set (two hundred and fifty samples) was manually divided into two homogeneous subsets, one containing 70% and the other 30% of the samples. The larger subset was called the calibration set and the smaller one the prediction set. The calibration set was used to build the predictive model, while the prediction set was used to test the calibration model on external samples and validate the results. Leave-one-out cross-validation in the calibration set was used to determine the number of latent variables in each calibration model, based on the minimization of the standard error and the highest correlation coefficient in the cross-validation, which avoids overfitting of the model. The models were thus validated by full cross-validation and by external set validation. Leave-one-out cross-validation consists of using a single sample from the calibration set as validation data and the others as training data. In this way, each sample in the set is used once as validation data, and the process ends when all the samples have been used (Xie et al. 2007; Frizon et al. 2015).
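The selection of the number of latent variables by leave-one-out cross-validation can be sketched as below. This is a minimal pure-Python PLS1 implementation based on the NIPALS algorithm, written only for illustration; it is not the Chemoface implementation used in this study, and all function names and the synthetic data are hypothetical.

```python
import math


def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model with the NIPALS algorithm."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                              # y residuals
    components = []
    for _ in range(n_components):
        # Weights: direction of maximum covariance between E and f.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict one sample by repeating the deflation steps of the fit."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat


def rmsecv_loo(X, y, n_components):
    """Root mean square error of leave-one-out cross-validation."""
    squared = 0.0
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        squared += (predict_pls1(model, X[i]) - y[i]) ** 2
    return math.sqrt(squared / len(X))


def select_components(X, y, max_components):
    """Return the component count with the lowest RMSECV and all errors."""
    errors = {a: rmsecv_loo(X, y, a) for a in range(1, max_components + 1)}
    return min(errors, key=errors.get), errors


# Synthetic, noise-free data (hypothetical): 8 samples, 3 variables.
X = [[1.0, 2.0, 0.5], [2.0, 1.0, 1.5], [3.0, 4.0, 0.0], [4.0, 3.0, 2.0],
     [5.0, 6.0, 1.0], [6.0, 5.0, 3.0], [7.0, 8.0, 0.5], [8.0, 7.0, 2.5]]
y = [2.0 * a - b + 0.5 * c + 1.0 for a, b, c in X]
best, errors = select_components(X, y, max_components=3)
```

In this sketch the number of latent variables is chosen only by the minimum RMSECV; the correlation criterion mentioned above would have to be added separately.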

In addition to the models built from the spectra as originally obtained, some spectral data pretreatment methods were also evaluated, namely multiplicative scatter correction (MSC), standard normal variate (SNV), first derivative, and second derivative. Thus, for each response (electrical conductivity and potassium leaching), five models were generated and compared. The prediction ability of the calibration models was assessed by calculating the root mean square errors for the calibration (RMSECV) and prediction (RMSEP) sets, the correlation (R) between predicted and experimental values in cross-validation (Rcv) and prediction (Rp), the ratio of prediction to deviation (RPD), the slope, and the bias. The Chemoface v. 1.61 software was used for data pretreatment and for building the regression models (Nunes et al. 2012).

Results and Discussion

NIR Signatures of Coffee Beans

Figure 2 shows the original spectra (Fig. 2a) and the spectra obtained after applying the pretreatments (Fig. 2b to e) for all green coffee samples. The spectral profile observed is due to overlapping absorptions that correspond to overtones and combinations of vibrational modes of the C–H, N–H, O–H, and S–H groups. The intensities of these bands are determined by the change in the dipole moment of the molecule. Because the hydrogen atom has the lowest mass, bonds involving it show the largest vibrations and deviations from harmonic behavior, which explains the presence of this element in the main bands of the near infrared (Blanco and Villarroya 2002; Manley 2014).

Fig. 2 NIR absorbance spectra of green coffee (a), after MSC preprocessing (b), after SNV preprocessing (c), after first derivative preprocessing (d), and after second derivative preprocessing (e)

The samples show a similar spectral pattern. The absorptions around 5986 and 4643 cm−1 are attributed to caffeine, which has C–H bonds in the molecule. The bands at 5623 and 4443 cm−1 are associated with the C–H and O–H groups of cellulose. Between 8600 and 8200 cm−1 and between 6000 and 5700 cm−1, the absorptions are attributed to C–H bonds of fatty acids, lignin, and amino acids, and at 5129 cm−1 to carboxylic acids and esters (C=O) (Buratti et al. 2014). According to Santos et al. (2016), water absorption bands are typically found between 5000 and 5200 cm−1 as well as between 7200 and 6800 cm−1, which was also reported by Alessandrini et al. (2008). Sucrose and other carbohydrates show absorption peaks between 8210 and 8051 cm−1 and between 4336 and 4325 cm−1; phenolic compounds and chlorogenic acids absorb between 6793 and 6765 cm−1; and the absorptions between 5861 and 5834, 4105 and 4040, and 4032 and 4019 cm−1 are attributed to lipid compounds (Ribeiro et al. 2011).

During NIR analysis of solid samples, such as coffee, undesirable variations can occur, mainly due to light scattering effects and differences in the length of the optical path, causing variations in the baseline and non-linearity effects (Rinnan et al. 2009). Preprocessing is therefore used to correct the spectral data and is a fundamental step in building simpler and more robust regression models (Barbin et al. 2014). As it is difficult to predict which preprocessing method is most suitable for the data under study, it is common for several methods to be tested and evaluated. The multiplicative scatter correction (MSC) and the standard normal variate (SNV) (Fig. 2b and c) are methods applied to correct spectral distortions, caused mainly by particles of different sizes, which result in scattering and a sloping baseline (Manley 2014). In addition, Savitzky-Golay derivatives, with a 15-point window and a second-degree polynomial, were also used (Fig. 2d and e). According to Rinnan (2014), derivatives are commonly used and the window size has a direct effect on the results obtained. Overall, the first and second derivatives increase spectral resolution and correct the baseline (Casale and Simonetti 2014).
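The pretreatments described above can be sketched in a few lines. The code below is an illustrative pure-Python version and not the Chemoface implementation; the function names are hypothetical. It assumes that SNV uses the sample standard deviation, that MSC takes the mean spectrum as reference, and that the Savitzky-Golay derivative is returned only for interior points, so that seven points are dropped at each end for the 15-point window.

```python
import math


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    ref = [sum(col) / len(spectra) for col in zip(*spectra)]
    ref_mean = sum(ref) / len(ref)
    s_rr = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / len(s)
        # Least-squares fit of the spectrum on the reference spectrum.
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(ref, s)) / s_rr
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


def savgol_derivative(spectrum, order, half_window=7, spacing=1.0):
    """First or second derivative from a local quadratic fit.

    The window holds 2 * half_window + 1 points (15 by default).
    """
    offsets = list(range(-half_window, half_window + 1))
    sum_sq = sum(i * i for i in offsets)
    if order == 1:
        coeffs = [i / sum_sq for i in offsets]
    elif order == 2:
        mean_sq = sum_sq / len(offsets)
        denom = sum((i * i - mean_sq) ** 2 for i in offsets)
        coeffs = [2.0 * (i * i - mean_sq) / denom for i in offsets]
    else:
        raise ValueError("order must be 1 or 2")
    scale = spacing ** order
    result = []
    for center in range(half_window, len(spectrum) - half_window):
        window = spectrum[center - half_window:center + half_window + 1]
        result.append(sum(c * v for c, v in zip(coeffs, window)) / scale)
    return result
```

For a spectrum that is exactly quadratic in the wavenumber, the local quadratic fit reproduces the analytical derivatives, which gives a simple check of the coefficients.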

Calibration Models for Electrical Conductivity

The mean values, standard deviation, and range of the electrical conductivity data in the calibration and prediction sets are shown in Table 1. The coffee samples showed electrical conductivity values ranging from 104.09 to 193.65 μS/cm/g, with a standard error of 1.19 μS/cm/g. These values are similar to those reported by Favarin et al. (2004), who evaluated the electrical conductivity of arabica coffee submitted to different post-harvest managements and obtained values between 123.10 and 205.74 μS/cm/g. Borém et al. (2008) found values from 85.67 to 230.00 μS/cm/g for coffee subjected to different processing and drying methods. Higher electrical conductivity values can indicate that cell membranes are more disorganized and permeable. This process is one of the first to occur in the deterioration of coffee beans and causes loss of quality (Oliveira et al. 2015a).

Table 1 Reference values for electrical conductivity and potassium leaching in the calibration and prediction sets

Table 2 shows the number of latent variables and the values of RMSECV, RMSEP, Rcv, Rp, bias, slope, and RPD. According to Table 2, although the different preprocessing methods resulted in models with similar predictive capabilities, the best calibration model, that is, the one with the highest correlation value and the lowest prediction error, was obtained with the second derivative. This mathematical pretreatment is used to improve the distinction between overlapping peaks and to correct the baseline (Cozzolino et al. 2008). The second derivative resulted in a model with Rcv, Rp, RMSECV, and RMSEP of 0.97, 0.94, 4.86, and 7.94, respectively. In addition, this model had the highest RPD value (2.41). According to Moscetti et al. (2019), models whose RPD values are greater than or equal to 2 can be considered excellent.

Table 2 Results of PLS regression models for both non-pretreated and pretreated spectra

Also, according to Table 2, although the prediction error increases for the test set, the statistics related to the prediction set did not differ much from the values found for the complete cross-validation set. The association between the electrical conductivity values determined in the laboratory and predicted by the best PLS model is shown in Fig. 3a and b. For this model, the average percentage errors were 2.80 and 3.60%, for the calibration and prediction sets, respectively.
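The average percentage error can be illustrated with the sketch below. The text does not give the formula, so the definition used here (mean of the absolute relative errors, in percent) is an assumption, and the function name and values are hypothetical.

```python
def mean_absolute_percentage_error(reference, predicted):
    """Mean of |predicted - reference| / |reference|, in percent."""
    if any(r == 0 for r in reference):
        raise ValueError("reference values must be non-zero")
    total = sum(abs(p - r) / abs(r) for r, p in zip(reference, predicted))
    return 100.0 * total / len(reference)


# Hypothetical example: relative errors of 10% and 5%.
mape = mean_absolute_percentage_error([100.0, 200.0], [110.0, 190.0])
```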

Fig. 3 Reference values versus predicted values of electrical conductivity in calibration set (a) and prediction set (b)

Although the NIR predicts properties that show absorption in this region of the spectrum, other applications that involve calibrating models to predict the physical properties of samples and components that do not absorb in the NIR region are also possible. For example, particle size can be determined using NIR based on dispersion properties of particles with different sizes. On the other hand, the concentration of some component that does not absorb in the NIR region can be determined through the analysis of covariation of the non-organic component in relation to the organic component of the sample (Blanco and Villarroya 2002; Manley 2014).

Wang et al. (2014) evaluated the quality of commercial organic fertilizers and used near-infrared spectroscopy and PLS regression to predict the electrical conductivity of 104 samples. The results demonstrated that the technique was efficient for predicting electrical conductivity, with correlation values of 0.99 and 0.74 for the calibration and validation sets, respectively. Todorova et al. (2011) evaluated the electrical conductivity of soil samples and obtained correlation values of 0.89 and 0.74 for the calibration and validation sets. Cozzolino et al. (2011), in turn, investigated the use of NIR to predict the electrical conductivity of grapes after the homogenization of the berries and found an R2 value of 0.77 in the validation set. However, these authors highlight the difficulty of assigning given wavelengths to their respective molecular absorptions, especially when it comes to ions in solution, which can interfere in the spectra mainly through changes in hydrogen bonding.

Calibration Models for Potassium Leaching

Table 1 shows the mean, standard deviation, and range of reference values for potassium leaching for the coffee samples used in the calibration and prediction sets. The experimental values obtained for potassium leaching were between 40.41 and 64.92 ppm, with a standard error of 0.36. Nobre et al. (2011) evaluated the leaching of potassium from immature arabica coffee beans and found values between 32.7 and 69.0 ppm. Borém et al. (2008), in turn, found values between 23.33 and 68.33 ppm in arabica coffee submitted to different processing and drying methods. According to Oliveira et al. (2015b), wide variations in the potassium content of coffee are expected, since the mineral content is influenced by soil type and also by the cultivation conditions. In addition, zinc-deficient coffee plants are more likely to suffer cell membrane damage and consequently to lose more potassium and amino acids (Poltronieri et al. 2011).

As noted in Table 2, the standard normal variate (SNV) preprocessing method provided the calibration model with the best predictive capacity for potassium leaching values, with a higher correlation value in the calibration (Rcv = 0.88) and prediction (Rp = 0.80) sets, lower prediction errors (RMSECV = 2.68 and RMSEP = 3.22), and a higher RPD value (1.68). For this preprocessing, the average percentage errors found for the calibration and prediction sets were 5.20 and 6.07%, respectively. The SNV is applied to individual spectral data to reduce dispersion effects (Cozzolino et al. 2008). The number of latent variables was determined so that the maximum correlation between predicted and experimental values was obtained; at this point, the RMSECV value is at its minimum.

Raw spectra can contain considerable noise. The application of mathematical pretreatments can therefore reduce interferences and improve the chemical signal of the compounds to be analyzed (Tahir et al. 2017). As in this work, other studies have been based on preprocessed NIR spectra and have shown good prediction results. The SNV, MSC, first derivative, and second derivative techniques have been applied in several studies, such as the determination of ash and lipid content in roasted coffee (Pizarro et al. 2004) and the classification and determination of protein in powdered milk (Inácio et al. 2011). Other such studies deal with the determination of phenolic compounds and antioxidant activity in honey (Tahir et al. 2017), the prediction of the roasting degree of coffee (Alessandrini et al. 2008), and the discrimination between processing methods and species of coffee (Buratti et al. 2014).

Figure 4a and b show the experimental and predicted values for potassium leaching for the best calibration model. The inorganic nature of minerals seems to be a limiting factor in the use of the NIR technique. However, it is believed that mineral prediction using this technique is possible due to the association of these inorganic elements with organic components of food (Givens and Deaville 1999; Büning-Pfaue 2003; Manley 2014). Cozzolino et al. (2011) used NIR to determine the content of several minerals in grapes. For potassium, these authors obtained R2 values of 0.78 and 0.45 for the cross-validation and prediction sets, respectively. Better results were obtained by Lucas et al. (2008), who determined the potassium content in fresh cheese and found R2 values of 0.79 and 0.78 for the cross-validation and external validation sets, respectively.

Fig. 4 Reference values versus predicted values of potassium leaching in calibration set (a) and prediction set (b)

Conclusions

The results obtained in the calibration and prediction steps demonstrated that the PLS models developed to evaluate the electrical conductivity of the coffee grains had better prediction results than the models for estimating the potassium leaching values. The pretreatments applied to the near-infrared spectra did not substantially improve the performance of the calibration models. The models for estimating these grain properties can be used as a tool to indicate the stability of stored grains. In addition, the results indicate that near-infrared spectroscopy can be applied to assess the mineral content of coffee. Further studies with coffee from different crops and regions of the world can therefore be carried out, and the differences in mineral content can be used to indicate the geographical origin of the samples.