Introduction

Coffee is a major commodity in the world economy because of its high consumption. According to the International Coffee Organization (ICO), the 2014/2015 crop reached about 142,000,000 bags of 60 kg. During this period, Brazil remained the leading producer and exporter of coffee, with 43,200,000 bags produced, approximately 84% of which were intended for export (ICO 2014). Despite the large amount of capital involved in marketing this product, coffee has been reported in the literature as a frequent target of adulteration with lower-cost materials such as coffee husks, roasted corn, wood twigs or sticks, and soil, among others (Pauli et al. 2014; Reis et al. 2013a, b; Sano et al. 2003).

Adulteration is easy because the added materials, when roasted and ground under suitable conditions, are readily mistaken for genuine coffee powder. Furthermore, in Brazil, there is a standard technique for detecting adulterants in coffee that is not subjective. According to the norms of the Brazilian Coffee Industry Association (ABIC), created to ensure the quality of coffee, up to 1% of foreign material is allowed in roasted ground coffee, and its investigation has been carried out by techniques such as optical microscopy, chromatography, and infrared spectroscopy (ABIC 2013).

Near-infrared hyperspectral imaging is a promising technique that has been widely used for drug analysis and forensic chemistry in general. It is especially advantageous for heterogeneous samples because it provides chemical information for each pixel in the mapped area, even at low concentrations, without sample preparation (Amigo et al. 2008; Fu et al. 2014; Lim et al. 2016; Silva et al. 2014; Piqueras et al. 2012). The technique yields a three-dimensional data array, also known as a hypercube, with two spatial dimensions (from the area selected for mapping) and one spectral dimension (wavenumber). The spectrum in each pixel of the image contains chemical information about the species present, and this signal is proportional not to the overall concentration of the analyte on the mapped surface but to its abundance at that particular sampling point (Sun 2010; Gowen et al. 2007; Kim et al. 2011).

The massive quantity of data generated by hyperspectral imaging, which depends on the number of scans, the pixel size, and the selected area, often prevents the use of traditional processing methods and makes chemometric tools indispensable.

Chemometric tools are applied in different ways depending on the goal, such as classification, multivariate regression, or multivariate curve resolution. Multivariate curve resolution methods have the advantage of recovering the spectra of the pure species that make up the sample without prior knowledge of them (Parastar and Tauler 2014; de Juan and Tauler 2006; de Juan et al. 2004). These tools allow the development of a mathematical model that relates the spectra of the reference species in the mixture to their concentrations, making efficient detection and quantification possible (Rutan et al. 2009).

The principle of MCR-ALS (multivariate curve resolution with alternating least squares) (Jaumot et al. 2005) is to resolve, subject to certain constraints, the concentration matrix C and the pure spectra matrix S. The initial data matrix D is decomposed according to Eq. 1, where E is the error matrix containing the residual variation of the data.

$$ \mathbf{D}=\mathbf{C}{\mathbf{S}}^{\mathrm{T}}+\mathbf{E} \tag{1} $$
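The alternating scheme behind Eq. 1 can be illustrated with a minimal NumPy sketch. It is not the MCR Toolbox implementation: the function name `mcr_als`, the fixed iteration count, and the use of clipping to impose non-negativity (a crude stand-in for a proper non-negative least-squares solver) are all assumptions made for illustration, and the data are synthetic.

```python
import numpy as np

def mcr_als(D, S0, n_iter=100):
    """Alternate least-squares updates of C and S so that D ≈ C @ S.T.

    D  : (n_pixels, n_channels) data matrix
    S0 : (n_channels, n_components) initial estimate of the pure spectra
    Non-negativity is imposed by clipping (illustrative assumption).
    """
    S = np.asarray(S0, dtype=float).copy()
    for _ in range(n_iter):
        # Least-squares solution for C with S fixed, then clip negatives.
        C = np.clip(D @ np.linalg.pinv(S.T), 0.0, None)
        # Least-squares solution for S with C fixed, then clip negatives.
        S = np.clip((np.linalg.pinv(C) @ D).T, 0.0, None)
    E = D - C @ S.T  # residual matrix of Eq. 1
    return C, S, E

# Synthetic two-component mixture (made-up data, not from this work).
x = np.linspace(0.0, 1.0, 40)
S_true = np.column_stack([np.exp(-((x - 0.3) / 0.08) ** 2),
                          np.exp(-((x - 0.7) / 0.08) ** 2)])
rng = np.random.default_rng(0)
C_true = rng.uniform(0.2, 1.0, size=(50, 2))
D = C_true @ S_true.T

# Start from spectra with the right shape but the wrong intensity scale.
C, S, E = mcr_als(D, S0=S_true * np.array([2.0, 0.5]))
```

In this example the product of C and the transpose of S reproduces D, but C is returned on a scale that differs from `C_true`. This intensity ambiguity is one reason why additional information, such as the correlation constraint discussed later, is needed for quantification.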

The main condition for a suitable resolution is to start the process with the correct number of chemical components in the mixture, which, for first-order data, can be estimated from the percentage of variance explained by singular value decomposition (SVD) or principal component analysis (PCA) (de Juan and Tauler 2006). To obtain more appropriate models, constraints are applied, such as non-negativity of concentrations and/or spectra, unimodality, and mass closure (de Juan et al. 2009; de Juan et al. 2014; Tauler et al. 2009).
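As a sketch of how the number of components might be estimated from the explained variance, the following NumPy example counts the singular values needed to reach a cumulative threshold. The 99% threshold, the function names, and the synthetic block-structured data are assumptions chosen for illustration. In practice the choice is usually made by inspecting the singular values.

```python
import numpy as np

def explained_variance(D):
    """Percentage of the total sum of squares carried by each singular value."""
    s = np.linalg.svd(D, compute_uv=False)
    return 100.0 * s ** 2 / np.sum(s ** 2)

def n_components(D, threshold=99.0):
    """Smallest number of components whose cumulative percentage reaches threshold."""
    cumulative = np.cumsum(explained_variance(D))
    return int(np.searchsorted(cumulative, threshold) + 1)

# Synthetic data: two independent rank-one blocks plus a little noise.
rng = np.random.default_rng(0)
D = np.zeros((20, 20))
D[:10, :10] = 1.0
D[10:, 10:] = 1.0
D += 1e-3 * rng.standard_normal(D.shape)

k = n_components(D, threshold=99.0)
```

Here each block contributes about half of the total variance, so two components are required to pass the threshold.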

The correlation constraint is an advanced tool available in MCR-ALS. It incorporates concentration information for each constituent of the samples by relating the values estimated by the model to reference values (de Juan et al. 2008). In this case, an augmented matrix (DAUG) is built that includes a set of calibration samples along with the initial data matrix (Alexandrino et al. 2015). With this constraint, the model estimates are forced to correlate with the previously known concentration values, establishing a linear model. For hyperspectral imaging data, the concentration values obtained by the model for each pixel are used to create concentration distribution maps of the chemical species (Antunes et al. 2002; Jaumot et al. 2013).

In this work, we performed the detection and quantification of the main coffee adulterants, such as coffee husks, roasted and powdered corn kernels, wood sticks, and soil, by combining near-infrared hyperspectral imaging with multivariate curve resolution with alternating least squares (MCR-ALS). Detection was accomplished by comparing the reference spectrum of each adulterant with the spectra recovered by the MCR model using only the adulterated samples, and quantification was based on the MCR correlation constraint.

Another aim of this work was to ensure the quality of roasted and ground coffee by developing a control chart based on the results of MCR-ALS applied to hyperspectral imaging data. With this procedure, the purity of the coffee can be checked quickly and efficiently, and samples adulterated with coffee husks, roasted and powdered corn kernels, soil, and wood sticks can be identified simultaneously at concentrations above 1%. Although the term quality has numerous definitions, experts agree that it refers to the needs and expectations of consumers in relation to the product and is also associated with indirect factors such as price and packaging presentation, among others (Does et al. 1997; Caplen 1988). To better monitor the quality attributes of a product, statistical control can be established to assess variations relative to known standards (Dale 1994; Miller 1998). In this case, a control chart was developed, based on a center line and two control lines (Linna and Woodall 2001). First, a set of certified pure coffee samples was defined for use as controls. From the MCR-ALS scores (C matrix) of each control sample, the average of the values related to the coffee constituent was calculated. Samples were accepted as pure when they fell between the upper and lower limit lines, set at two times the standard deviation around the average of these score values.

Materials and Methods

Samples

In this work, Robusta coffee beans certified by the producers as free of adulterants were used as reference samples. Corn kernels were purchased at local stores, while soil, coffee husks, and wood sticks were obtained from crops in the region. To give the adulterants the same appearance as coffee, they were placed separately in aluminum trays and heated in an oven at 250 °C for 30 min, following a procedure similar to one already described in the literature (Schmidt et al. 2008). Finally, the adulterants were ground and added to the ground roasted coffee at concentrations ranging from 1.0 to 40.0% (w/w). The mixtures were homogenized in a vortex mixer and pressed at 6 tons into tablets to ensure a smooth surface.

Equipment

The spectra were obtained using a Spotlight 400 N FT-NIR Imaging System from Perkin-Elmer. The tablets were placed on the microscope stage, and the focus was adjusted for each sample with the help of a joystick. The analysis conditions were 32 scans in the range of 4000 to 7800 cm⁻¹, pixels of 25 μm², and a monitored area of 4 mm², resulting in 160 × 160 pixels, or 25,600 spectra per sample.
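Before chemometric analysis, each image is typically unfolded so that every pixel spectrum becomes one row of a two-way matrix. The sketch below shows this step for the 160 × 160 pixel grid reported above. The number of spectral channels (`n_channels = 8`) is a hypothetical small value used only to keep the example light, and the cube is filled with placeholder numbers rather than measured data.

```python
import numpy as np

n_rows, n_cols = 160, 160   # pixel grid reported in the text
n_channels = 8              # hypothetical; the real value depends on the spectral resolution

# Placeholder hypercube: two spatial dimensions and one spectral dimension.
cube = np.arange(n_rows * n_cols * n_channels, dtype=float).reshape(
    n_rows, n_cols, n_channels)

# Unfold: one spectrum per row, so pixel (r, c) lands in row r * n_cols + c.
D = cube.reshape(n_rows * n_cols, n_channels)

# Refolding restores the spatial layout, which is how per-pixel results
# are later turned back into maps.
refolded = D.reshape(n_rows, n_cols, n_channels)
```

The unfolded matrix has 25,600 rows, matching the number of spectra per sample stated above.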

The data were analyzed using the MCR Toolbox 2.0 for Matlab, freely available at https://mcrals.wordpress.com/download/mcr-als-2-0-toolbox.

Results and Discussion

Data Pre-processing

The first step in developing a multivariate model is the appropriate choice of pre-processing. In this work, the near-infrared spectra obtained in reflectance mode (R) were converted to pseudo-absorbance by a logarithmic function (log 1/R), and multiplicative scatter correction (MSC) was then applied to reduce scattering effects, as shown in Fig. 1.

Fig. 1 Spectral profile of an adulterated coffee sample (a) before and (b) after pre-processing by MSC
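The two pre-processing steps described above can be sketched as follows. This is an illustration under stated assumptions, not the routine used in this work: a base-10 logarithm is assumed for log(1/R), the mean spectrum is assumed as the MSC reference, and the function names and test spectra are hypothetical.

```python
import numpy as np

def pseudo_absorbance(R):
    """Convert reflectance R (values in (0, 1]) to log10(1/R)."""
    return np.log10(1.0 / np.asarray(R, dtype=float))

def msc(X, reference=None):
    """Multiplicative scatter correction of the spectra stored in the rows of X.

    Each spectrum x is regressed on the reference spectrum r (the mean
    spectrum by default), x ≈ a + b * r, and corrected as (x - a) / b.
    """
    X = np.asarray(X, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(X)
    for i, x in enumerate(X):
        b, a = np.polyfit(ref, x, 1)  # polyfit returns slope first, then intercept
        corrected[i] = (x - a) / b
    return corrected

# Synthetic spectra: one band shape with different offsets and scale factors,
# mimicking additive and multiplicative scatter (made-up values).
x_axis = np.linspace(0.0, 1.0, 50)
band = np.exp(-((x_axis - 0.5) / 0.1) ** 2)
offsets = np.array([0.00, 0.10, 0.25])
scales = np.array([1.0, 1.5, 0.7])
X = offsets[:, None] + scales[:, None] * band[None, :]

X_msc = msc(X)
```

Because the synthetic spectra differ only by offset and scale, all corrected rows collapse onto the mean spectrum, which is the behaviour MSC is designed to produce.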

Detection and Quantification of the Adulterants by MCR-ALS

In a first step, a model was generated for each set of samples containing coffee plus one adulterant, and SIMPLISMA was used to obtain the coffee and adulterant spectra. The adulterant can then be identified from the spectrum recovered by MCR, and quantification can be performed using the calibration model generated by MCR-ALS with the correlation constraint for this adulterant. This procedure is important to ensure that the generated concentration maps correspond to the species of interest (coffee and adulterant). The spectra recovered by the MCR-ALS algorithm using SIMPLISMA for each adulterant, together with the reference spectra of the adulterants, are presented in Fig. 2. For each spectrum recovered by MCR-ALS, the correlation coefficient with the spectrum of the reference species was determined as a measure of similarity. For the adulterants studied, these values ranged from 0.90 to 0.99, indicating that all adulterants could be identified reliably and showing the potential of the proposed methodology.

Fig. 2 Spectra recovered by SIMPLISMA. (a) Coffee husks. (b) Roasted and powdered corn kernels. (c) Wood sticks. (d) Soil
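The similarity check between a recovered spectrum and a reference spectrum can be sketched as a correlation coefficient. The text does not specify which coefficient was used, so the Pearson coefficient is assumed here. The function name and the synthetic spectra are hypothetical.

```python
import numpy as np

def spectral_correlation(recovered, reference):
    """Pearson correlation coefficient between two spectra of equal length."""
    return float(np.corrcoef(recovered, reference)[0, 1])

# Synthetic spectra (made-up band shapes, not measured data).
x_axis = np.linspace(0.0, 1.0, 200)
reference = np.exp(-((x_axis - 0.3) / 0.05) ** 2)
rescaled = 3.0 * reference + 0.5                    # same shape, other scale and offset
other = np.exp(-((x_axis - 0.7) / 0.05) ** 2)       # band in a different position

r_same = spectral_correlation(rescaled, reference)
r_other = spectral_correlation(other, reference)
```

The Pearson coefficient is insensitive to scale and offset, which suits MCR results because recovered spectra are obtained in arbitrary intensity units. A spectrum with the same shape therefore gives a value close to 1, while an unrelated band gives a low value.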

The number of factors used in the MCR calculations was determined by SVD analysis, and it was equal to two for all samples.

To perform the quantification using the correlation constraint in the MCR-ALS algorithm, five samples were used for calibration, at concentrations of 5.0%, 10.0%, 20.0%, 30.0%, and 40.0% (w/w), for each adulterant (coffee husks, roasted and powdered corn kernels, wood sticks, and soil) added to the coffee. The matrix decomposed by MCR-ALS was formed by the mean spectra of the calibration samples together with the spectra of each pixel of the image. During the optimization, the algorithm fixes the concentration values of the calibration samples in the concentration matrix.
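One update of the correlation constraint, as it is commonly described for MCR-ALS, can be sketched as follows. The ALS estimates for the calibration rows are regressed on their known concentrations, the resulting line rescales all rows, and the calibration rows are then fixed to the reference values. The function name, argument layout, and numbers are hypothetical, and the toolbox used in this work may differ in detail.

```python
import numpy as np

def correlation_constraint(c_als, cal_idx, c_ref):
    """Apply one correlation-constraint update to an analyte concentration profile.

    c_als   : ALS estimate of the analyte column of C (arbitrary scale)
    cal_idx : row indices of the calibration samples in the augmented matrix
    c_ref   : known concentrations of those calibration samples
    """
    c_als = np.asarray(c_als, dtype=float)
    c_ref = np.asarray(c_ref, dtype=float)
    # Linear model between ALS estimates and reference values.
    slope, offset = np.polyfit(c_als[cal_idx], c_ref, 1)
    c_new = slope * c_als + offset  # every row expressed in concentration units
    c_new[cal_idx] = c_ref          # calibration rows fixed to the known values
    return c_new, slope, offset

# Calibration levels from the text (% w/w) followed by two made-up "pixel" values.
true_conc = np.array([5.0, 10.0, 20.0, 30.0, 40.0, 12.0, 25.0])
cal_idx = np.arange(5)
c_als = 0.02 * true_conc + 0.1      # ALS output on an arbitrary scale (synthetic)

c_new, slope, offset = correlation_constraint(c_als, cal_idx, true_conc[cal_idx])
```

In a full algorithm this update would be applied to the analyte column of C at every ALS iteration, before the spectra are re-estimated.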

Subsequently, from the MCR-ALS results for each adulterant, concentration distribution maps of the constituents of each sample and the corresponding histograms of adulterant concentration were built. As an example, Fig. 3 presents the concentration distribution maps for coffee adulterated with husks. The maps are complementary with respect to the regions where the adulterant is present or absent. The heterogeneous distribution of the adulterant seen in the chemical maps emphasizes the importance of hyperspectral imaging for quantifying the adulterant more reliably, since, depending on the region selected for analysis and/or the resolution used, the presence of the adulterant in the measured region would not be guaranteed.

Fig. 3 Distribution concentration maps for coffee husk adulterant (area of 4 mm², resulting in 160 × 160 pixels). (a) 10% adulteration. (b) 15% adulteration. (c) 20% adulteration. (d) 30% adulteration

By averaging the concentrations over all pixels of the image, it was possible to determine the concentration of the adulterant in the sample. Table 1 shows the expected and predicted values (with their absolute errors) for the studied adulterants. The absolute errors for quantification of the adulterants in coffee were less than 4% for the models developed, indicating the feasibility of the proposed methodology.

Table 1 Predicted values and errors for the adulterant quantification
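The step from per-pixel concentrations to a single value per sample can be sketched as below. The concentration vector is synthetic (a strongly heterogeneous map in which one tenth of the pixels contain only adulterant), chosen so that the arithmetic is easy to follow. It does not reproduce any value from Table 1, and the function name is hypothetical.

```python
import numpy as np

def sample_concentration(c_pixels, shape=(160, 160)):
    """Refold a per-pixel concentration vector into a map and average it."""
    conc_map = np.asarray(c_pixels, dtype=float).reshape(shape)
    return conc_map, float(conc_map.mean())

# Synthetic per-pixel adulterant concentrations in % (w/w).
c_pixels = np.zeros(160 * 160)
c_pixels[:2560] = 100.0  # 10% of the 25,600 pixels

conc_map, predicted = sample_concentration(c_pixels)
expected = 10.0                          # nominal adulteration level of this synthetic sample
absolute_error = abs(predicted - expected)
```

The same map can be passed to a histogram routine to inspect how unevenly the adulterant is distributed over the surface.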

Control Charts

To guarantee the efficiency of the quality characterization of coffee samples using NIR hyperspectral imaging and the curve resolution method, a control chart was developed with a total of 35 samples containing different concentrations of all the types of adulterants previously mentioned. The concentration score values estimated by MCR-ALS were used to prepare the chart, without the use of any constraint or quantitative value. In this case, a single matrix was formed from all the images obtained, including 10 images without any adulterant to serve as the basis of the control chart.

The limits used in the proposed control chart followed the standard procedure for control chart development. In this procedure, a target value is defined, which in this work was the average of the scores obtained from the images without adulteration. Two sets of limits are also defined: warning limits (average plus or minus two standard deviations, with 95% of the samples inside this range) and action limits (average plus or minus three standard deviations, with 99.7% of the samples inside this range). In this work, samples with score values above the warning limit were considered out of control. Figure 4 shows the control chart developed for detecting adulteration in coffee. In this figure, the first 10 samples correspond to coffee without adulterants, whose average score was 0.12. The other samples contain at least 1% of some adulterant and are therefore outside the control lines with 95% confidence. This result indicates that a simple MCR-ALS model, built using only the hyperspectral images, can be used when the goal is to know whether a sample is adulterated or not.

Fig. 4 Control chart for coffee adulteration
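The construction of the control limits can be sketched as follows. The ten "pure coffee" scores are made-up numbers chosen only so that their mean equals the reported value of 0.12. The use of the sample standard deviation (`ddof=1`), the function names, and the rule of flagging any score outside the warning band are assumptions of this sketch.

```python
import numpy as np

def control_limits(pure_scores):
    """Center line with warning (2 SD) and action (3 SD) limits."""
    mean = float(np.mean(pure_scores))
    sd = float(np.std(pure_scores, ddof=1))  # sample standard deviation (assumption)
    return {
        "center": mean,
        "warning": (mean - 2.0 * sd, mean + 2.0 * sd),
        "action": (mean - 3.0 * sd, mean + 3.0 * sd),
    }

def out_of_control(score, limits):
    """Flag a sample whose score falls outside the warning band."""
    lower, upper = limits["warning"]
    return bool(score < lower or score > upper)

# Hypothetical MCR-ALS scores for ten unadulterated samples (mean 0.12).
pure_scores = [0.10, 0.11, 0.12, 0.13, 0.14, 0.12, 0.11, 0.13, 0.12, 0.12]
limits = control_limits(pure_scores)

flag_pure = out_of_control(0.12, limits)     # a score at the center line
flag_suspect = out_of_control(0.30, limits)  # a hypothetical score far from it
```

With these illustrative values, a score at the center line is accepted and a score far from it is flagged as adulterated.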

Conclusions

The use of NIR hyperspectral imaging combined with the MCR-ALS algorithm for the detection and quantification of various types of adulterants (coffee husks, roasted and powdered corn kernels, wood sticks, and soil) in coffee samples proved to be effective, with errors smaller than 4% (w/w). An additional advantage is that the adulterant present in the sample can be identified by comparing the spectra recovered by MCR with the reference spectra of the adulterants.

Control charts were used to assess the quality of the samples by relating the scores obtained by the MCR-ALS method to adulteration of the coffee samples, using only the hyperspectral images. Coffee adulterated at levels above 1% could be identified in a practical and effective way.