A Metric to Improve the Robustness of Conformal Predictors in the Presence of Error Bars

Murari, Andrea; Talebzadeh, Saeed; Vega, Jesús; Peluso, Emmanuele; Gelfusa, Michela; Lungaroni, Michele; Gaudio, Pasqualino

doi:10.1007/978-3-319-33395-3_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9653)

Included in the following conference series:

Symposium on Conformal and Probabilistic Prediction with Applications

Abstract

Conformal predictors, currently applied to many problems in various fields, determine precise levels of confidence in new predictions on the basis only of the information present in the past data, without recourse to any assumption except that the examples are generated independently from the same probability distribution. In this paper, the robustness of their results is assessed for the cases in which the data are affected by error bars. This situation is typical of the physical sciences, whose data are often the result of complex measurement procedures unavoidably affected by noise. Assuming that the noise follows a normal distribution, the Geodesic Distance on Gaussian Manifolds provides a statistically principled and quite effective method to handle the uncertainty in the data. A series of numerical tests shows that adopting this metric in conformal predictors significantly improves their performance, compared to the Euclidean distance, even for relatively low levels of noise.

1 Conformal Predictors and Measurement Errors

Machine-learning methods often work very well and have found many applications in both the public and the private sector. On the other hand, the reliability of their performance is typically proven only asymptotically, which is of limited use in practice. Conformal predictors, which perform competitively in terms of success rates, include from their conception simple and useful measures of confidence [1]. Conformal prediction can be based on any technique of point prediction for classification or regression, including support-vector machines, decision trees, neural networks and Bayesian methods. Starting from the point prediction tool, a conformal predictor is built by defining a nonconformity measure, which determines how unusual an example is relative to previous examples. The conformal algorithm, based on the statistical concept of p-values, turns this nonconformity measure into prediction regions: given a nonconformity measure, it produces a prediction region Uε for every probability of error ε. The region Uε is a (1−ε)-prediction region, i.e. it contains the label of the next example with probability at least 1−ε. Conformal predictors are therefore conservatively valid, which means that the probability that they make a mistake when their output is at confidence level 1−ε is not greater than ε.

In most of the nonconformity measures utilised by conformal predictors, the Euclidean distance is implicitly assumed to be the proper metric for the calculation of the nonconformity scores and the p-values. The Euclidean distance has a precise geometrical meaning and a very long historical pedigree. However, it implicitly requires treating all data as single, infinitely precise values. This assumption can be appropriate in some applications, but not in physics, where measurements typically come with an error bar. An alternative idea is to use a different distance between data, one that takes the measurement uncertainties into account. The sources of uncertainty in a measurement are typically many and, from a statistical point of view, can be considered random variables. As a consequence, their global contribution can often be modelled as noise with a normal distribution. The idea behind the approach proposed in this paper is therefore to consider the measurements not as points, but as Gaussian distributions [2]. This requires defining a distance between Gaussians. The appropriate choice is the Geodesic Distance on the Gaussian Manifold (GDGM) of the measurements, which can be expressed in closed form (see Sect. 3) [3]. As shown in the rest of the paper, adopting this geodesic distance can significantly increase the accuracy of traditional conformal predictors, even when the data are affected by a very limited level of noise.

With regard to the structure of the paper, the next section provides a short introduction to the general framework of conformal prediction. The mathematical background of the main tool introduced in the paper, the Geodesic Distance on Gaussian Manifolds, is the subject of Sect. 3. The proposed method is assessed with a series of numerical tests using a toy model described in Sect. 4. Section 5 reports in detail the results of the numerical tests. Conclusions and lines of future work are provided in the last section of the paper.

2 The Framework of Conformal Prediction for Classification

The task of classification basically consists of attributing objects to different classes. Mathematically this can be formalised by considering successive ordered pairs (x₁, y₁), (x₂, y₂), …, which are called examples. Each example consists of an object x_i and its label y_i, where the former is the feature vector that describes object i. The objects are elements of a measurable space X called the object space; the labels are elements of a measurable space Y called the label space. It is common practice to adopt a more compact notation, according to which z_i indicates the ordered pair (x_i, y_i), and Z := X × Y is defined as the example space.

Many machine learning tools are available to perform classification. However, as mentioned earlier, the vast majority of them cannot easily quantify the quality of their predictions. Conformal predictors, by contrast, have been conceived explicitly to quantify the reliability of their predictions, and they achieve this on the basis of the past examples. To this end, for each new sample to classify, it is necessary to measure how different the new sample is from the old examples. For this purpose a nonconformity measure is defined, which yields a nonconformity score estimating how different a new example is from a bag of old ones. A bag of size n ∈ ℕ is a collection of n elements, some of which may be identical. In this paper, the notation < z₁, …, z_n > indicates a bag of n elements.

Given a nonconformity measure A and a bag < z₁, …, z_n >, the nonconformity score can be calculated as:

$$ \alpha_{i} := A\left( \left\langle z_{1}, \ldots, z_{i-1}, z_{i+1}, \ldots, z_{n} \right\rangle, z_{i} \right) $$

(1)

for each example z_i in the bag. Because nonconformity measures are not absolute but relative, the numerical value of $ \alpha_{i} $ does not, by itself, determine how unusual z_i is according to the measure A. To really quantify how unusual a sample is, it is necessary to compare $ \alpha_{i} $ with the nonconformity scores $ \alpha_{j} $ of the other members of the bag. The p-value is a convenient and statistically sound way of calculating how anomalous a new example is. By definition the p-value is the fraction

$$ p_{val}(z_{i}) = \frac{\#\,\{ j = 1, \ldots, n : \alpha_{j} \ge \alpha_{i} \}}{n} $$

(2)

This indicator, which lies between 1/n and 1, is the fraction of the examples in the bag that are at least as nonconforming as z_i; in the literature it is called the p-value of the element z_i (p_val(z_i)). The symbol “#” stands for the number of elements j in the collection whose nonconformity score is at least as high as that of element i. The lower bound is 1/n because j includes i itself. The lower the p-value, i.e. the closer to 1/n for large n, the more nonconforming z_i is and the more likely it is to be an outlier; this means that z_i is not representative of the typical member of the bag. If the p-value is large, i.e. close to its upper bound 1, then z_i is very conforming, or very representative of the typical member of the bag. For classification, a p-value is computed for each candidate label of the new sample, which is then attributed to the class with the highest p-value.
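As an illustration of Eq. (2), the following minimal sketch computes the p-value of one element from a list of nonconformity scores. The function name `p_value` and the numerical scores are hypothetical and chosen only for this example; they are not taken from the paper.

```python
def p_value(scores, i):
    """Fraction of examples in the bag at least as nonconforming as example i (Eq. 2)."""
    n = len(scores)
    # Count every j (including j == i) whose score is >= the score of i.
    count = sum(1 for a in scores if a >= scores[i])
    return count / n


# Hypothetical nonconformity scores for a bag of four examples.
scores = [0.2, 0.5, 0.1, 3.0]
print(p_value(scores, 3))  # 0.25: only the element itself is this nonconforming
print(p_value(scores, 2))  # 1.0: every element is at least as nonconforming
```

Because the count always includes the element itself, the result can never fall below 1/n, in agreement with the lower bound stated above.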

On the basis of the p-values, conformal predictors allow calculating, for each new classification, two indicators, confidence and credibility, which quantify the reliability of the prediction. Credibility is defined as the largest p-value; confidence is defined as one minus the second-largest p-value. Confidence can be interpreted as the probability that the prediction corresponding to the maximal p-value is correct. A low credibility, typically less than 0.05, intuitively means that either the training set is non-random or the test object is not representative of the training set. If the maximum p-value occurs for more than one class, an ambiguity is present and the algorithm is not able to classify the sample. It is important to emphasize that the confidence and credibility of the prediction play a role analogous to the observed level of significance in statistical parameter tests.
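The definitions of credibility and confidence can be sketched in a few lines. The function name `credibility_confidence`, the dictionary layout and the p-values below are assumptions made for illustration; at least two candidate labels are assumed.

```python
def credibility_confidence(p_values):
    """Return (predicted label, credibility, confidence).

    p_values maps each candidate label to its p-value (hypothetical layout).
    The label is None when the maximum p-value is shared by several classes.
    """
    ranked = sorted(p_values.items(), key=lambda kv: kv[1], reverse=True)
    (best_label, largest), (_, second_largest) = ranked[0], ranked[1]
    if largest == second_largest:
        best_label = None  # ambiguity: the sample cannot be classified
    credibility = largest
    confidence = 1.0 - second_largest
    return best_label, credibility, confidence


# Hypothetical p-values for three classes.
print(credibility_confidence({"A": 0.125, "B": 0.25, "C": 1.0}))  # ('C', 1.0, 0.75)
```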

3 Geodesic Distance on Gaussian Manifolds

As mentioned in Sect. 1, in the natural sciences the data available are typically the result of experimental measurements. In this context, all measurements are affected by uncertainties, referred to as error bars. The sources of this uncertainty are normally numerous, and it is therefore reasonable to assume that the pdf of the noise is normal. Each measurement can therefore be modelled as a probability density function (pdf) of the Gaussian type, determined by its mean μ and its standard deviation σ:

$$ p(x;\mu ,\sigma ) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ - \frac{(x - \mu )^{2}}{2\sigma^{2}} \right] $$

(3)

It is normal practice to assume that the experimental measured value is the mean of the pdf, since this is the most likely value of the pdf. The standard deviation can be determined independently from the knowledge of the instrumentation.

The set of normal distributions can therefore be modelled as a two-dimensional space, or better a two-dimensional manifold, parameterised by $ \mu $ and $ \sigma $. Modelling measurements not as point values, but as Gaussian distributions, requires defining a distance between Gaussians. The most appropriate definition of distance between Gaussian distributions is the geodesic distance (GDGM) on the probabilistic manifold containing the data, which is not a Euclidean but a Riemannian space. This geodesic distance on the Gaussian manifold can be calculated using the Fisher-Rao metric [3, 4]. For two univariate Gaussian distributions p₁(x|μ₁, σ₁) and p₂(x|μ₂, σ₂), parameterised by their means $ \mu_{i} $ and standard deviations $ \sigma_{i} $ (i = 1, 2), the geodesic distance GDGM is given by:

$$ GD(p_{1} \,||\, p_{2}) = \sqrt{2}\, \ln\frac{1 + \delta}{1 - \delta} = 2\sqrt{2}\, \tanh^{-1} \delta, \quad \text{where} \quad \delta = \left[ \frac{(\mu_{1} - \mu_{2})^{2} + 2(\sigma_{1} - \sigma_{2})^{2}}{(\mu_{1} - \mu_{2})^{2} + 2(\sigma_{1} + \sigma_{2})^{2}} \right]^{\frac{1}{2}} $$

(4)
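Two limiting cases of Eq. (4), obtained by direct substitution, may help to build intuition; they are offered as a worked illustration and are not part of the original derivation. For equal means the distance depends only on the ratio of the standard deviations:

$$ \mu_{1} = \mu_{2}: \quad \delta = \frac{|\sigma_{1} - \sigma_{2}|}{\sigma_{1} + \sigma_{2}}, \quad \frac{1 + \delta}{1 - \delta} = \frac{\max(\sigma_{1}, \sigma_{2})}{\min(\sigma_{1}, \sigma_{2})}, \quad GD = \sqrt{2}\, \left| \ln\frac{\sigma_{1}}{\sigma_{2}} \right| $$

For equal standard deviations the numerator of δ reduces to the squared difference of the means:

$$ \sigma_{1} = \sigma_{2} = \sigma: \quad \delta = \frac{|\mu_{1} - \mu_{2}|}{\sqrt{(\mu_{1} - \mu_{2})^{2} + 8\sigma^{2}}} $$

In the second case, for a fixed difference of the means, δ and hence the distance decrease as σ grows.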

As will be shown in detail in the next sections, replacing the Euclidean distance with the GDGM significantly improves the robustness of the classification. Fig. 1 shows a graphical example of the effect of the metric in Eq. (4). In a Cartesian coordinate system (μ, σ), each point represents a Gaussian distribution. Among the four distributions considered, the Euclidean distance is larger between the two wider distributions, whereas the geodesic distance is smaller between the two wider distributions. This behaviour reflects the physical interpretation according to which physical quantities with larger error bars are to be considered closer and more similar than those with narrower error bars.
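Eq. (4) translates directly into code. The sketch below is a minimal illustration, with the function name `gdgm` and the numerical values chosen only for this example; it compares two pairs of Gaussians whose means are equally far apart but whose error bars differ.

```python
import math


def gdgm(mu1, sigma1, mu2, sigma2):
    """Geodesic distance between two univariate Gaussians, following Eq. (4)."""
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    dmu2 = (mu1 - mu2) ** 2
    delta = math.sqrt((dmu2 + 2 * (sigma1 - sigma2) ** 2) /
                      (dmu2 + 2 * (sigma1 + sigma2) ** 2))
    # 2*sqrt(2)*atanh(delta) equals sqrt(2)*ln((1 + delta)/(1 - delta)).
    return 2 * math.sqrt(2) * math.atanh(delta)


# Same difference of the means, different error bars (hypothetical values).
narrow = gdgm(0.0, 0.1, 1.0, 0.1)
wide = gdgm(0.0, 1.0, 1.0, 1.0)
print(narrow > wide)  # True: the wider distributions are closer to each other
```

Since both standard deviations are positive, δ is always strictly smaller than 1, so the inverse hyperbolic tangent is always defined.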

4 A Toy Model

To exemplify and demonstrate the usefulness of the method proposed in this paper, a series of numerical tests has been performed. They are based on a toy model already introduced in [5]. The simplicity of the model makes it possible to appreciate both the nature of the problem and the advantages of adopting the proposed metric, the GDGM. The classification task consists of classifying points on a straight line, on which three classes have been defined. The problem is represented graphically in Fig. 2. The aim is to classify the new point Q and to provide the confidence and credibility of the classification.

For the purpose of this example, the classification is based on the nearest neighbour. Mathematically, given a “bag” {z₁, …, z_{n−1}}, where each z_i consists of a feature vector x_i and a non-numerical label y_i, when a new example z_n = (x_n, y_n) becomes available for classification, its feature vector x_n is known but its label y_n is not. The nearest-neighbour method finds the x_i closest to x_n, and its label y_i becomes the prediction of y_n. A natural way to measure the nonconformity of an example z_i with respect to the other examples consists of comparing the distance from x_i to the nearest object with the same label with its distance to the nearest object with a different label. For example, the nonconformity scores can be defined as:

$$ \alpha_{i} = \frac{{{\text{d}}_{\text{sl}} }}{{{\text{d}}_{\text{dl}} }} $$

(5)

$$ {\text{d}}_{\text{sl}} = \min \left\{ \left| x_{j} - x_{i} \right| : 1 \le j \le n,\; j \ne i,\; y_{j} = y_{i} \right\} $$

$$ {\text{d}}_{\text{dl}} = \min \left\{ \left| x_{j} - x_{i} \right| : 1 \le j \le n,\; j \ne i,\; y_{j} \ne y_{i} \right\} $$
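The nonconformity score of Eq. (5) can be sketched as follows. The function name `nonconformity` and the data are hypothetical; the sketch assumes that every example has at least one other example with the same label and one with a different label, and that no two examples with different labels coincide.

```python
def nonconformity(xs, ys, i, dist=lambda a, b: abs(a - b)):
    """Eq. (5): nearest same-label distance divided by nearest different-label distance."""
    others = [j for j in range(len(xs)) if j != i]
    # min() raises ValueError if a class has no other member (assumed not to happen).
    d_sl = min(dist(xs[j], xs[i]) for j in others if ys[j] == ys[i])
    d_dl = min(dist(xs[j], xs[i]) for j in others if ys[j] != ys[i])
    return d_sl / d_dl


# Hypothetical points on a line with two classes.
xs = [1.0, 2.0, 10.0, 11.0, 10.5]
ys = ["A", "A", "C", "C", "C"]
print(nonconformity(xs, ys, 4))  # small score: 10.5 sits well inside class C
```

The `dist` argument defaults to the absolute difference used in the definitions above, so a different metric can be passed in without changing the rest of the function.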

For the new point Q = 14.85 shown in Fig. 2, the nonconformity scores are presented in Table 1. The resulting credibility and confidence are 1 and 0.9844, respectively, and point Q is assigned to Class C.

Table 1. Non-conformity measurements for point Q = 14.85

In the previous example, the nonconformity measure of Eq. (5) has been calculated using the Euclidean distance between the various points. All the derived quantities are therefore also based on this metric. In the case of measurements affected by noise, the Euclidean metric is not adequate, and adopting the GDGM provides several improvements, as discussed in the next section.
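To make the change of metric concrete, the sketch below runs a nearest-neighbour conformal classification with an interchangeable distance function. It is a minimal illustration under stated assumptions, not the authors' implementation: each measurement is assumed to be stored as a (value, σ) pair, every class is assumed to have at least two examples, and the training points, the common error bar of 0.5 and all function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Distance between measured values only; the error bars are ignored."""
    return abs(a[0] - b[0])


def gdgm(a, b):
    """Geodesic distance of Eq. (4) between two (mean, sigma) pairs."""
    (mu1, s1), (mu2, s2) = a, b
    dmu2 = (mu1 - mu2) ** 2
    delta = math.sqrt((dmu2 + 2 * (s1 - s2) ** 2) / (dmu2 + 2 * (s1 + s2) ** 2))
    return 2 * math.sqrt(2) * math.atanh(delta)


def score(objs, labels, i, dist):
    """Nonconformity score of Eq. (5) for example i with the chosen metric."""
    others = [j for j in range(len(objs)) if j != i]
    d_sl = min((dist(objs[j], objs[i]) for j in others if labels[j] == labels[i]),
               default=math.inf)
    d_dl = min((dist(objs[j], objs[i]) for j in others if labels[j] != labels[i]),
               default=math.inf)
    return d_sl / d_dl if d_dl > 0 else math.inf


def classify(train_objs, train_labels, new_obj, dist):
    """Return the p-value of Eq. (2) for each candidate label of new_obj."""
    p = {}
    for y in sorted(set(train_labels)):
        objs = train_objs + [new_obj]
        labels = train_labels + [y]  # tentatively assign label y
        scores = [score(objs, labels, i, dist) for i in range(len(objs))]
        p[y] = sum(a >= scores[-1] for a in scores) / len(scores)
    return p


# Hypothetical training set: three classes on a line, all with sigma = 0.5.
train_objs = [(v, 0.5) for v in (1, 2, 3, 7, 8, 9, 14, 15, 16)]
train_labels = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
new_obj = (14.85, 0.5)

for name, dist in (("Euclidean", euclidean), ("GDGM", gdgm)):
    p = classify(train_objs, train_labels, new_obj, dist)
    print(name, max(p, key=p.get), p)
```

Passing the metric as an argument keeps the conformal machinery unchanged, so that only the notion of distance differs between the two runs.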

5 Results of the Numerical Tests

In order to assess the potential of the GDGM metric to counteract the effect of noise, a series of systematic tests has been performed using the toy model introduced in the previous section. To this end, a series of points has been automatically generated along the straight line of Fig. 2. These are to be considered the true values of the physical quantity to be measured. Gaussian noise, with zero mean and standard deviation equal to a percentage (10 %, 20 %, …) of the value itself, has then been added to the generated points. The resulting values are taken as the available measurements, affected by additive Gaussian noise. These points have then been classified with the nonconformity measure based on the nearest-neighbour criterion, using both the Euclidean distance and the GDGM. The results are reported in Table 2 for the Euclidean distance and in Table 3 for the GDGM.
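The noise-generation step described above can be sketched as follows. The function name `add_relative_noise`, the range of the true values and the random seed are assumptions made for illustration; only the zero-mean Gaussian noise with standard deviation proportional to the value is taken from the text.

```python
import random


def add_relative_noise(true_values, fraction, rng):
    """Return (measured value, sigma) pairs with sigma = fraction * |true value|."""
    measurements = []
    for v in true_values:
        sigma = fraction * abs(v)
        # Zero-mean Gaussian noise added to the true value.
        measurements.append((rng.gauss(v, sigma), sigma))
    return measurements


rng = random.Random(0)  # fixed seed, only for reproducibility of the sketch
true_values = [rng.uniform(1.0, 20.0) for _ in range(50)]  # hypothetical range
noisy_10 = add_relative_noise(true_values, 0.10, rng)
noisy_20 = add_relative_noise(true_values, 0.20, rng)
print(len(noisy_10), len(noisy_20))  # 50 50
```

Each returned pair carries both the noisy value and its error bar, which is the information a distance between Gaussians needs.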

Table 2. Classification using the Euclidean distance to calculate the nearest neighbour. The first column reports the accuracy (Acc.), the second the credibility (Cred.) and the third the confidence (Conf.). The following columns report the same quantities for different levels of noise. The top of the table reports the average values for all the 50 points.

Table 3. Classification using the Geodesic Distance on Gaussian Manifolds to calculate the nearest neighbour. The first column reports the accuracy (Acc.), the second the credibility (Cred.) and the third the confidence (Conf.). The following columns report the same quantities for different levels of noise. The top of the table reports the average values for all the 50 points.

The results reported in Tables 2 and 3 indicate that the GDGM provides a clear improvement in the success rate of the classification. Table 2 shows how the performance of conformal predictors degrades with increasing levels of noise. It is also important to notice how the indicators of the quality of the prediction, confidence and credibility, tend to overestimate the reliability of the classification when a significant level of noise is present. Table 3 reports the clear improvement in both the performance and the reliability of the quality indicators when the Euclidean distance is replaced with the GDGM. Another important consideration is that adopting the GDGM metric does not cause any degradation of performance when the data are not affected by noise.

6 Conclusions

In many applications of conformal predictors, the Euclidean distance is explicitly or implicitly adopted as the right metric. In the case of experimental measurements typical of the physical sciences, the data are affected by normally distributed noise. In this situation, the GDGM proves to be a better metric for the definition of the nonconformity measure. Calculating the nonconformity measure and the p-values with the GDGM provides significantly more reliable classifications, by reducing the adverse effects of the noise. The reported results using the GDGM have been obtained on a desktop computer with two Xeon E5520 @2.27 GHz processors and 24 GB of RAM, and required an average of one minute for each test performed, for a total of 50 min for all 50 points. The computational cost is therefore very similar to the one required to perform the calculations with the Euclidean distance.

With regard to future developments, it would be important to apply the same approach to different pdfs. Particularly relevant would be the case of the Poisson distribution, since in practice many detectors work in photon-counting or particle-counting mode. Another very interesting application would be the case in which the pdf of the noise is not known. This situation is of practical interest because in many experiments the uncertainties in the measurements can be quantified with an interval but without any additional specification: the real value is expected to fall in a certain interval, but no further information is available. In this case, the implementation of an appropriate form of uncertain probability is expected to produce improvements in the classification of conformal predictors comparable to those obtained with the GDGM for measurements affected by Gaussian noise.

In terms of practical applications, the mathematics of conformal predictors can be applied to most classifiers, including fuzzy ones [6]. The approach can therefore be of great help in all cases, such as disruptions in Tokamaks, where classification is a particularly difficult task, partly because of the uncertainties in the measurements [7, 8].

References

1. Vovk, V., Gammerman, A., Shafer, G.: Algorithmic Learning in a Random World. Springer, New York (2005)
2. Katz, J.O., Rohlf, F.J.: Function-point cluster analysis. Syst. Zool. 22(3), 295–301 (1973)
3. Amari, S., Nagaoka, H.: Methods of Information Geometry. Oxford University Press, Oxford (1993)
4. Murari, A., et al.: Nucl. Fusion 53, 033006, 9 (2013)
5. Vega, J., et al.: Rev. Sci. Instrum. 81, 10E118 (2010)
6. Murari, A., et al.: Ann. Math. Artif. Intell. 74(1), 155–180 (2015)
7. Murari, A., et al.: Nucl. Fusion 49, 055028, 11 (2009)
8. Murari, A., Peluso, E., Gelfusa, M., Lungaroni, M., Gaudio, P.: How to handle error bars in symbolic regression for data mining in scientific applications. In: Gammerman, A., Vovk, V., Papadopoulos, H. (eds.) SLDS 2015. LNCS, vol. 9047, pp. 347–355. Springer, Heidelberg (2015). doi:10.1007/978-3-319-17091-6_29

Author information

Authors and Affiliations

Consorzio RFX (CNR, ENEA, INFN), Acciaierie Venete SpA, Universita’ di Padova, Corso Stati Uniti 4, 35127, Padua, Italy
Andrea Murari
University of Rome “Tor Vergata”, Via del Politecnico 1, 00133, Rome, Italy
Saeed Talebzadeh, Emmanuele Peluso, Michela Gelfusa, Michele Lungaroni & Pasqualino Gaudio
Asociación EURATOM/CIEMAT para Fusión, 28040, Madrid, Spain
Jesús Vega

Corresponding author

Correspondence to Emmanuele Peluso.

Editor information

Editors and Affiliations

University of London, Egham, United Kingdom
Alexander Gammerman
University of London, Egham, United Kingdom
Zhiyuan Luo
CIEMAT, Madrid, Spain
Jesús Vega
University of London, Egham, United Kingdom
Vladimir Vovk

About this paper

Cite this paper

Murari, A. et al. (2016). A Metric to Improve the Robustness of Conformal Predictors in the Presence of Error Bars. In: Gammerman, A., Luo, Z., Vega, J., Vovk, V. (eds) Conformal and Probabilistic Prediction with Applications. COPA 2016. Lecture Notes in Computer Science(), vol 9653. Springer, Cham. https://doi.org/10.1007/978-3-319-33395-3_8

DOI: https://doi.org/10.1007/978-3-319-33395-3_8
Published: 17 April 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-33394-6
Online ISBN: 978-3-319-33395-3
eBook Packages: Computer Science, Computer Science (R0)
