Comparison between classical and innovative class-modelling techniques for the characterisation of a PDO olive oil

Oliveri, Paolo; Casale, Monica; Casolino, M. Chiara; Baldo, M. Antonietta; Nizzi Grifi, Fiammetta; Forina, Michele

doi:10.1007/s00216-010-4377-1

Original Paper
Published: 07 November 2010

Volume 399, pages 2105–2113, (2011)

Analytical and Bioanalytical Chemistry

Paolo Oliveri¹,
Monica Casale¹,
M. Chiara Casolino¹,
M. Antonietta Baldo²,
Fiammetta Nizzi Grifi³ &
…
Michele Forina¹

Abstract

An authentication study of the Italian PDO (protected designation of origin) olive oil Chianti Classico was performed using near-infrared and UV–Visible spectroscopy, an artificial nose and an artificial tongue. The sample set was representative of the whole Chianti Classico production and included a considerable number of samples from a close production area (Maremma). The non-specific signals provided by the four fingerprinting analytical techniques, after proper pre-processing, were used to build class models for Chianti Classico oils. The outcomes of classical class-modelling techniques, namely soft independent modelling of class analogy and unequal dispersed classes (derived from quadratic discriminant analysis), were compared with those of two techniques recently introduced into chemometrics: multivariate range modelling and CAIMAN analogues modelling methods.

Introduction

In recent years, there has been a growing interest among consumers in the safety and traceability of food products. In particular, there has been an increasing focus on the geographical origin of raw materials and finished products, for several reasons including specific sensory properties, perceived health values, confidence in locally produced products and media attention [1]. As a result of these factors, the European Union recognised and supported the differentiation of quality products on a regional basis [2], introducing an integrated framework for the protection of geographical origin for agricultural products and foodstuffs by specific regulation [3]. This regulation permits the application of the following labels to a food product: protected designation of origin (PDO), protected geographical indication and traditional speciality guaranteed.

In order to assure quality and to safeguard consumers from commercial frauds, there is an increasing need for analytical tools able to verify whether a product, sold with a specific label, is actually compatible with that claim or not.

The present work was focused on building efficient class models for the Chianti Classico PDO olive oil, potentially useful to guarantee the authenticity of this product and to avoid possible frauds, using fast and easy-to-automate methodologies. Such an achievement may be of great interest both for the Chianti Classico PDO olive oil consortium, which has been protecting and promoting the PDO appellation since its creation in 2001 [4, 5], and for consumer associations.

Classical chemical analyses, including fatty acid, sterol and triterpenic alcohol determinations, followed by proper application of chemometric pattern recognition strategies, have been widely demonstrated to be excellent tools for recognising the geographical origin of olive oils [6–8]. Nevertheless, these traditional analyses have several drawbacks, the most significant of which are low speed, the necessity for sample pre-treatments, a requirement for highly skilled personnel and destruction of the sample.

Several fast and non-destructive instrumental methods have been proposed to overcome these hurdles [9]. Among them, infrared and UV–Vis spectroscopy, artificial noses and tongues have proven to be successful analytical methods for the characterisation of olive oil [10–13]. All these techniques give non-specific fingerprints of oil samples. This means that the instrumental signals are employed neither to verify the presence or absence of particular chemical compounds in a sample nor to measure single chemical/physical properties, but to obtain a comprehensive, multivariate description of the sample, analogous to a fingerprint. Such non-specific instrumental fingerprints embody complex, multivariate information, which may be successfully used for authentication purposes by means of a proper multivariate pattern recognition analysis.

As for the chemometrics strategies, classification and, foremost, class-modelling techniques represent appropriate tools for verifying authenticity of a product. Discriminant classification methods—such as linear discriminant analysis (LDA) and partial least squares discriminant analysis—require at least two classes to be considered. Such techniques build a delimiter between these classes and, then, they always assign new samples to one of these classes, even in the case of samples belonging to other—non-considered—classes. Instead, class-modelling techniques study each class individually, providing an answer to the general question: “Is this sample, claiming to be of class X, really compatible with the class X model?”, which is essentially the question to be answered in food authentication [14].

In the present study, class models for the Chianti Classico PDO olive oil were built using both classical and state-of-the-art chemometric methods, comparing and discussing their relative performances and their applicability.

Experimental

Olive oil samples

A total of 57 extra virgin olive oils, all obtained from the 2008–2009 olive harvest, were analysed. This sample set comprised 23 authentic (i.e., of guaranteed provenance) oils from different producers associated with the Chianti Classico PDO consortium and 34 oils from a close production area (Maremma). The latter samples are characterised by a number of features (e.g., the olive cultivars) which are closely related to those of Chianti Classico PDO. For this reason, they can be considered an excellent set of test samples, helpful for assessing the performances of Chianti Classico models.

The samples were collected directly and personally from known and reliable farmers and oil mills.

Instrumentation and procedures

Olive oil samples were analysed by means of four fingerprinting instruments: an NIR and a UV–Vis spectrophotometer, an artificial nose and an artificial tongue.

NIR measurements were performed by means of a FT near-infrared spectrophotometer based on a polarisation interferometer (Buchi NIRFlex N-500) in the 4,000–10,000 cm⁻¹ range at 4 cm⁻¹ resolution. The samples were analysed in the transmittance mode using quartz cuvettes with 5 mm path length. Samples were measured under thermostated conditions at 35 ± 0.5 °C.

UV–Vis absorption spectra were obtained in the range 190–1,100 nm, at 1 nm resolution, using an Agilent 8453 spectrophotometer. The radiation source is a combination of a deuterium-discharge lamp for the ultraviolet wavelength range and a tungsten lamp for the visible wavelength range. The measurement cells employed were quartz cuvettes with 5 mm path length.

In order to remove oil turbidity that could affect NIR and UV–Vis spectra, all the oil samples were centrifuged at 3,000 rpm for 30 min before being analysed.

The artificial nose (headspace mass spectrometer) used in this study was assembled in our laboratory and it is formed by an automatic headspace sampler and introduction system, directly coupled to a quadrupole mass detector without any interposed chromatographic separation. An extensive description of the instrumental configuration and of the analytical procedure can be found elsewhere [15]. Signals were recorded between 50 and 150 m/z.

Artificial tongue analyses were performed by an Amel 2059 potentiostat/galvanostat equipped with an Amel 7800 interface and function generator. For the measurements, a two-electrode cell was employed. The working electrode was a Pt microdisk (r = 12.5 μm), coupled with an Ag wire pseudo-reference electrode. Olive oil samples were previously filtered through paper to remove the particulate and mixed with the room temperature ionic liquid trihexyl(tetradecyl)phosphonium bis(trifluoromethylsulfonyl)imide, at 0.1 M concentration, in order to provide sufficient conductivity to the matrix [13]. The cyclic voltammograms (three consecutive cycles) were recorded by applying the following conditions: initial potential = 0.0 V; low potential = −3.0 V; high potential = +3.0 V; final potential = 0.0 V; initial scan polarity: negative; scan rate = 20 mV s⁻¹; T = 21 ± 1 °C. The complex current/potential profiles obtained were used—in the same way as the spectral profiles—to characterise the oil samples.

All four analytical techniques allowed measurements to be taken directly from the oil samples, without complex sample pre-treatments, resulting in a significant reduction in time and costs. Two series of measurements were taken for all the samples, randomising the order of analysis. Finally, each sample was identified with the average of the two replicated signals, acquired from each instrumental technique.

Data analysis

Digital signals were organised in four data matrices, one for each instrumental technique. Each matrix was arranged in 57 rows—corresponding to the samples analysed—and as many columns as signal elements, i.e., instrumental variables.

NIR spectra were reduced to the range 4,400–8,900 cm⁻¹ and UV–Vis spectra to the range 290–730 nm, in order to eliminate regions that were non-informative and/or characterised by a low signal-to-noise ratio. For the same reasons, electronic nose mass spectra were reduced to the range 45–90 m/z. All these signals were then pre-treated by the standard normal variate transformation [16] and, subsequently, by column centering. Electronic tongue cyclic voltammograms were pre-treated by computing the differences between the current values measured at the same potential (forward scan minus backward scan) [13] and, subsequently, by column autoscaling.
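The pre-treatments named above can be sketched in a few lines of Python. This is only an illustration on invented numbers, not the V-PARVUS implementation: the function names are hypothetical, the sample standard deviation (n − 1 denominator) is an assumption, and the forward and backward currents are assumed to be already aligned at the same potentials.

```python
from statistics import mean, stdev


def snv(row):
    """Standard normal variate: centre and scale ONE signal by its own mean and SD."""
    m, s = mean(row), stdev(row)  # stdev uses n - 1 (assumption)
    return [(v - m) / s for v in row]


def centre_columns(matrix):
    """Subtract each column mean so every variable has zero mean across samples."""
    col_means = [mean(col) for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, col_means)] for row in matrix]


def autoscale_columns(matrix):
    """Centre each column and divide it by its standard deviation."""
    cols = list(zip(*matrix))
    means = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in matrix]


def scan_difference(forward, backward):
    """Current difference at matching potentials (forward minus backward)."""
    return [f - b for f, b in zip(forward, backward)]


# Toy data: 3 samples x 4 variables (invented numbers, not from the study).
spectra = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.5, 2.5, 3.0, 5.0],
]

# Spectral route: SNV row by row, then column centering.
pretreated = centre_columns([snv(row) for row in spectra])
```

In this toy set the second spectrum is exactly twice the first, so SNV maps both to the same profile, which illustrates how the transformation removes multiplicative and offset differences between signals.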

As a first step, principal component analysis (PCA) was applied as an unsupervised exploratory tool, useful for visualising the actual sample distribution in the multivariate space [17].
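For readers unfamiliar with PCA, the following sketch extracts the first principal component of a small column-centred matrix by power iteration on the covariance matrix. It is a didactic assumption, not the algorithm used in the study: the function name, the iteration count and the all-ones starting vector are hypothetical choices, and the starting vector would fail in the special case where it is orthogonal to the first component.

```python
from math import sqrt
from statistics import mean


def first_pc(matrix, n_iter=200):
    """Return (loadings, scores) of the first principal component."""
    means = [mean(col) for col in zip(*matrix)]
    centred = [[v - m for v, m in zip(row, means)] for row in matrix]
    n, p = len(centred), len(means)
    # Sample covariance matrix (p x p).
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p  # starting direction (assumption)
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Scores are the projections of the centred samples onto the loadings.
    scores = [sum(a * b for a, b in zip(row, v)) for row in centred]
    return v, scores


# Toy data lying exactly on the line y = 2x (invented numbers).
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
loadings, scores = first_pc(data)
```

Because the toy points lie on a straight line, the loadings come out proportional to (1, 2) and the scores are simply the positions of the samples along that line.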

Then, a number of classical and novel supervised class-modelling methods were applied in order to build models for verifying the authenticity of the Chianti Classico PDO olive oil.

Soft independent modelling of class analogy (SIMCA) was the first class-modelling technique introduced into chemometrics [18]; the central feature of this method is the application of PCA to the sample category studied (e.g., a PDO food product), generally after within-class autoscaling or centering. SIMCA models are defined by the range of the sample scores on a selected number of low-order principal components (PCs), referred to as the SIMCA inner space. Model shapes therefore correspond to rectangles (two PCs), parallelepipeds (three PCs) or hyper-parallelepipeds (more than three PCs).

Unequal dispersed classes (UNEQ), originating in the work of Hotelling [19], was introduced into chemometrics by Derde and Massart [20]. This technique derives from quadratic discriminant analysis; it is based on the hypothesis of a multivariate normal distribution in each category studied and, consequently, on the use of the T² statistic to define a class space. The UNEQ model is the class centroid, i.e., the vector of the mean values of each variable, computed from the objects of the modelled class. The boundary of the class space around the centroid is an ellipse (two variables), an ellipsoid (three variables) or a hyper-ellipsoid (more than three variables). The dispersion of a class space is defined by the critical value of the T² statistic at a selected confidence level. UNEQ should be applied in cases when the ratio between the number of objects in a given category and the number of variables measured is three or greater. In cases involving many variables (such as spectral and voltammetric data), it is possible to apply UNEQ following a preliminary reduction in the number of variables by PCA.
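As an illustration of the UNEQ idea, the sketch below builds a two-variable class model (for instance, two PC scores) from the centroid and covariance of the training objects and measures the T² distance, i.e. the squared Mahalanobis distance, of a sample from the centroid. All names and numbers are hypothetical. The critical value at a chosen confidence level would normally come from an F distribution; that step is not reproduced here, and the sketch instead uses the largest training distance as the boundary, which corresponds to a model forced to accept every training object.

```python
from statistics import mean


def fit_uneq_2d(scores):
    """Return (centroid, inverse covariance) for a list of (x, y) training scores."""
    n = len(scores)
    xs, ys = zip(*scores)
    cx, cy = mean(xs), mean(ys)
    sxx = sum((x - cx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - cy) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - cx) * (y - cy) for x, y in scores) / (n - 1)
    det = sxx * syy - sxy ** 2
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (cx, cy), inv


def t2_distance(point, centroid, inv):
    """Squared Mahalanobis distance of a point from the class centroid."""
    dx, dy = point[0] - centroid[0], point[1] - centroid[1]
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


# Toy training scores for the modelled class (invented numbers).
training = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
centroid, inv = fit_uneq_2d(training)

# Forced boundary: the largest T2 among the training objects (assumption).
threshold = max(t2_distance(p, centroid, inv) for p in training)


def accepted(point):
    """True if the point falls inside the elliptical class space."""
    return t2_distance(point, centroid, inv) <= threshold
```

With this toy set the class space is a circle of squared radius 2 around (1, 1), so a sample at (1.5, 1.0) is accepted and one at (4.0, 4.0) is rejected.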

In comparison with these two classical techniques, two methods recently developed and presented have been applied in the present study: multivariate range modelling (MRM) and CAIMAN analogues modelling methods (CAMM).

MRM [21] follows a simple and intuitive way of building class models by employing the ranges of the predictors. In order to take into account intercorrelations between variables in the dataset, several additional features are computed as linear combinations of the original predictors and employed to build the models: LDA canonical variables represent a typical example. A basic feature of MRM is the capability of providing, by definition, class models with 100% sensitivity: this property is very important when a model is built for a food protection consortium, the production of whose affiliated producers must be all recognised and accepted by the model. A further advantage of MRM can be found in its outcome, which is easily understandable and interpretable also by people with a limited knowledge of multivariate analysis.
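The range-based principle can be sketched as follows. This is a simplified illustration, not the published MRM algorithm: the function names are hypothetical, and the single extra feature used here (the difference between the two variables) merely stands in for the linear combinations, such as LDA canonical variables, mentioned above.

```python
def fit_range_model(training):
    """Store the (min, max) range of every column of the training matrix."""
    return [(min(col), max(col)) for col in zip(*training)]


def range_accepts(model, sample):
    """A sample is accepted only if every variable lies within its range."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(sample, model))


def augment(row):
    """Append one linear combination (x - y) as an extra feature (assumption)."""
    return list(row) + [row[0] - row[1]]


# Toy training set with two strongly correlated variables (invented numbers).
training = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2]]

plain_model = fit_range_model(training)
augmented_model = fit_range_model([augment(r) for r in training])

# Inside both individual ranges, but far from the correlation pattern.
odd_sample = [1.0, 3.0]
```

Every training object is accepted by construction, which is the 100% sensitivity property described above. The sample `odd_sample` passes the plain range model yet is rejected once the extra feature is added, showing why combinations of variables are needed to account for intercorrelations.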

CAMM [22] is a family of powerful class-modelling techniques which builds the class models using distances (leverages or Mahalanobis distances from the class centroids) as predictors. Such new variables may be used separately or in combination with the original predictors; the class models and the class spaces are defined as in the case for UNEQ.

Data analysis was performed by means of the chemometric package V-PARVUS [23].

Results and discussion

Exploratory analysis

Figures 1, 2, 3 and 4 show the PCA score plots of Chianti Classico and Maremma olive oil samples, corresponding to the four different instrumental data matrices. The Cartesian axes correspond, in each case, to the PCs presenting the greatest Fisher weight, which is an estimate of the ratio between inter-class and intra-class variance [24], used as a measure of the power for distinguishing between the two categories of oil considered.
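One common two-class form of the Fisher weight divides the squared difference between the class means by the sum of the class variances. The sketch below uses that form to pick the most discriminating component; whether it matches the exact definition used in the study is an assumption, and the function name and scores are invented for illustration.

```python
from statistics import mean, variance


def fisher_weight(class_a, class_b):
    """Between-class over within-class variance for one variable (two classes)."""
    return (mean(class_a) - mean(class_b)) ** 2 / (variance(class_a) + variance(class_b))


# Toy scores of two classes on two components (invented numbers).
pc_scores = [
    {"chianti": [1.0, 2.0, 3.0], "maremma": [7.0, 8.0, 9.0]},  # well separated
    {"chianti": [1.0, 2.0, 3.0], "maremma": [2.0, 3.0, 4.0]},  # overlapping
]

weights = [fisher_weight(pc["chianti"], pc["maremma"]) for pc in pc_scores]

# Index of the component with the greatest discriminating power.
best = max(range(len(weights)), key=weights.__getitem__)
```

The well-separated component receives a much larger weight than the overlapping one, so it would be chosen as a plot axis.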

From the examination of these plots, it is clear that the two spectroscopic techniques allow a direct and well-resolved differentiation of the Chianti Classico oils from the Maremma samples. In fact, the samples corresponding to oils of the two categories are evidently clustered and distinguishable in the space defined by the first two PCs (Figs. 1 and 2).

On the other hand, for the artificial nose and tongue data, the distinction between Chianti Classico and Maremma samples is less well defined and the samples of the two classes partially overlap. Nevertheless, two regions, each characterised by a relatively high density of objects belonging to one of the two classes, are noticeable on the plots (Figs. 3 and 4).

This suggests that artificial nose and tongue signals also contain information potentially useful for the characterisation of the Chianti Classico olive oil although, in this case, the data are evidently more complex.

Class-modelling

Since, in all four data sets (one for each analytical technique), the number of variables was very high compared to the number of samples, class-modelling methods were always applied to the scores of low-order principal components (two PCs for UV–Vis and NIR spectroscopy, seven PCs for artificial nose and tongue).

Model validation was performed by a cross-validation scheme with five cancellation groups. The evaluation parameters computed are sensitivity (i.e., the percentage of samples of the modelled class correctly accepted by the class model) and specificity (i.e., the percentage of samples extraneous to the modelled class correctly rejected by the class model). Sensitivity may be considered an experimental evaluation of the confidence level of a model, while specificity may be considered an experimental measurement of the ability of the model to reject non-pertinent samples. In order to take into account the balance between these two parameters, a model efficiency value was computed as the geometric mean of sensitivity and specificity.
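The three evaluation parameters and the splitting into cancellation groups can be written compactly as below. The numbers are invented and do not reproduce any result of the study; the function names are hypothetical, and assigning sample i to group i mod 5 is only one possible splitting scheme, assumed here for illustration.

```python
from math import sqrt


def sensitivity(class_accepted):
    """Percentage of modelled-class samples accepted by the model."""
    return 100.0 * sum(class_accepted) / len(class_accepted)


def specificity(others_accepted):
    """Percentage of extraneous samples rejected by the model."""
    rejected = sum(1 for flag in others_accepted if not flag)
    return 100.0 * rejected / len(others_accepted)


def efficiency(sens, spec):
    """Geometric mean of sensitivity and specificity."""
    return sqrt(sens * spec)


def cancellation_groups(n_samples, n_groups=5):
    """Systematic split: sample i is left out in group i % n_groups (assumption)."""
    return [[i for i in range(n_samples) if i % n_groups == g]
            for g in range(n_groups)]


# Toy acceptance flags (True = accepted by the class model), invented numbers.
class_accepted = [True, True, True, True, False]   # 4 of 5 accepted
others_accepted = [True] * 11 + [False] * 9        # 9 of 20 rejected

sens = sensitivity(class_accepted)
spec = specificity(others_accepted)
eff = efficiency(sens, spec)

groups = cancellation_groups(57)
```

In this toy case a sensitivity of 80% and a specificity of 45% give an efficiency of 60%, which shows how the geometric mean penalises an imbalance between the two parameters more than an arithmetic mean would.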

As for SIMCA, UNEQ and CAMM, class models for Chianti Classico PDO olive oil have been built at a 95% confidence level. After that, new models forced to retain all the objects of the training set of the modelled class (corresponding to 100% sensitivity) have been developed as well. MRM provides, by definition, models with 100% sensitivity. Therefore, MRM outcomes should be compared with those obtained with the other methods for the forced models.

Tables 1, 2, 3 and 4 report the modelling results for the characterisation of the Chianti Classico PDO olive oil with the four analytical techniques employed in this study, respectively.

Table 1 Modelling results of NIR data

Table 2 Modelling results of UV–Vis data

Table 3 Modelling results of artificial nose data

Table 4 Modelling results of artificial tongue data

As expected from the results of the exploratory analysis, the spectroscopic techniques provided the most efficient models, with cross-validation efficiency values generally higher than 90%. Using spectral data, all the class-modelling methods investigated performed very well, not only at the 95% confidence level but also in the case of the forced models. It is quite common for such models, enlarged to accept all the samples belonging to the modelled class (100% sensitivity), to fall to very low specificity values, often unacceptable for practical purposes. By contrast, the forced models built for the Chianti Classico oil from NIR and UV–Vis data showed specificities generally higher than 95%, thus providing very satisfactory efficiencies (Tables 1 and 2).

Artificial nose and tongue data, owing to the higher complexity of the information they embody, as noted in the exploratory analysis, yielded models characterised by a generally lower efficiency (Tables 3 and 4).

Nevertheless, for both data sets there is an appreciable difference between the performances of MRM and CAMM and those of UNEQ and SIMCA.

In particular, UNEQ and SIMCA models were characterised by cross-validation efficiency values lower than 50%: such models have no practical interest. Conversely, as for artificial nose data, MRM provided a model with well-balanced sensitivity and specificity values in cross-validation, and 70% efficiency. Instead, in the case of artificial tongue data, the most efficient model was obtained by CAMM (80% cross-validation efficiency).

Figures 5, 6, 7 and 8 show the Coomans’ plots corresponding to the CAMM models obtained for the four analytical techniques, respectively. The graphs confirm that CAMM is able to provide efficient models for the characterisation of Chianti Classico PDO olive oil using data from any of the analytical techniques investigated in this study.

From the exploratory and modelling results, it emerges that the spectroscopic techniques worked better than the artificial nose and tongue for characterising the Chianti Classico PDO olive oil. Nevertheless, the comparison of the modelling outcomes shows that the novel methods applied (MRM and, mainly, CAMM) are capable of extracting the useful information from complex data and of providing acceptable models even in the case of overlapping distributions.

Conclusions

The present study proves the usefulness of four fingerprinting techniques in the authenticity verification of an Italian PDO olive oil. In particular, it is possible to conclude that, among them, NIR and UV–Vis spectroscopy directly provided the most valuable information for the problem studied. Conversely, the artificial nose and the artificial tongue employed for the same purpose supplied more complex outcomes. Furthermore, four class-modelling methods (UNEQ, SIMCA, MRM and CAMM) were compared. This comparison showed that the classical strategies, UNEQ and SIMCA, worked very well on well-resolved data, but failed to supply efficient models in the case of complex data distributions. The novel methods, MRM and CAMM, are instead capable of extracting the useful information from complex data, so they should be preferred when partially overlapping classes are being modelled.

References

1. Luykx DMAM, van Ruth SM (2008) Food Chem 107:897–911
2. Dimara E, Skuras D (2003) Eur J Market 37:690–705
3. European Council Regulation No 510/2006 of 20 Mar 2006
4. European Commission Regulation No 2446/2000 of 6 Nov 2000
5. Ministero delle Politiche Agricole e Forestali, Provvedimento of 4 Dec 2000, Gazzetta Ufficiale della Repubblica Italiana No 13 of 17 Jan 2001
6. Armanino C, Leardi R, Lanteri S, Modi G (1989) Chemom Intell Lab Syst 5:343–354
7. Ollivier D, Artaud J, Pinatel C, Durbec JP, Guérère M (2006) Food Chem 97:382–393
8. Lerma-García MJ, Herrero-Martínez JM, Ramis-Ramos G, Simó-Alfonso EF (2008) Food Chem 108:1142–1148
9. Valcárcel M, Cárdenas S (2005) Trends Anal Chem 24:67–74
10. Sinelli N, Casiraghi E, Tura D, Downey G (2007) J Near Infrared Spectrosc 16:335–342
11. Forina M, Boggia R, Casale M (2007) Ann Chim (Rome) 97:615–633
12. Cosio MS, Ballabio D, Benedetti S, Gigliotti C (2006) Anal Chim Acta 567:202–210
13. Oliveri P, Baldo MA, Daniele S, Forina M (2009) Anal Bioanal Chem 395:1135–1143
14. Forina M, Oliveri P, Lanteri S, Casale M (2008) Chemom Intell Lab Syst 93:132–148
15. Cerrato-Oliveros C, Boggia R, Casale M, Armanino C, Forina M (2005) J Chromatogr A 1076:7–15
16. Barnes RJ, Dhanoa MS, Lister SJ (1989) Appl Spectrosc 43:772–777
17. Jolliffe IT (2002) Principal component analysis. Springer series in statistics, 2nd edn. Springer, New York
18. Wold S, Sjöström M (1977) Chemometrics: theory and applications. In: Kowalski BR (ed) Chemometrics: theory and application, ACS Symposium Series 52. American Chemical Society, Washington, pp 243–282
19. Hotelling H (1947) Multivariate quality control. In: Eisenhart C, Hastay MW, Wallis WA (eds) Techniques of statistical analysis. McGraw-Hill, New York, pp 111–184
20. Derde MP, Massart DL (1986) Anal Chim Acta 184:33–51
21. Forina M, Oliveri P, Casale M, Lanteri S (2008) Anal Chim Acta 622:85–93
22. Forina M, Casale M, Oliveri P, Lanteri S (2009) Chemom Intell Lab Syst 96:239–245
23. Forina M, Lanteri S, Armanino C, Casolino C, Casale M, Oliveri P (2009) V-PARVUS 2009, University of Genoa, Genoa. http://www.parvus.unige.it. Accessed 2 Jan 2010
24. Harper AM, Duewer DL, Kowalski BR, Fashing JL (1977) In: Kowalski BR (ed) Chemometrics: theory and application, ACS Symposium Series 52. American Chemical Society, Washington, pp 14–52

Acknowledgments

Financial support by the Italian Ministry of University and Research (MIUR) and by the University of Genoa is gratefully acknowledged.

Author information

Authors and Affiliations

Department of Pharmaceutical and Food Chemistry and Technology, University of Genoa, Via Brigata Salerno 13, 16147, Genoa, Italy
Paolo Oliveri, Monica Casale, M. Chiara Casolino & Michele Forina
Department of Physical Chemistry, University of Venice, Calle Larga S. Marta 2137, 30123, Venice, Italy
M. Antonietta Baldo
The Chianti Classico DOP Olive Oil Consortium, Via Scopeti 155, 50026 San Casciano in Val di Pesa, Florence, Italy
Fiammetta Nizzi Grifi

Corresponding author

Correspondence to Paolo Oliveri.

Additional information

Published in the special issue Chemometrics (VII Colloquium Chemiometricum Mediterraneum) with Guest Editors Marcelo Blanco, Juan M. Bosque-Sendra and Luis Cuadros-Rodríguez.

About this article

Cite this article

Oliveri, P., Casale, M., Casolino, M.C. et al. Comparison between classical and innovative class-modelling techniques for the characterisation of a PDO olive oil. Anal Bioanal Chem 399, 2105–2113 (2011). https://doi.org/10.1007/s00216-010-4377-1

Received: 30 July 2010
Revised: 22 October 2010
Accepted: 24 October 2010
Published: 07 November 2010
Issue Date: February 2011
DOI: https://doi.org/10.1007/s00216-010-4377-1
