Introduction

Many parameters are defined by international scientific committees to describe the performance of a measuring system operating according to a given measurement procedure. A careful inspection of several international and national documents on measurement reveals discrepancies and inconsistencies in the definition of some fundamental terms. Although explicit definitions of trueness, precision and accuracy are provided in the International Vocabulary of Metrology 2012 (henceforth VIM), the understanding of these basic terms still proves difficult, especially in view of the additional specific noise introduced by individual languages. The current meaning of accuracy, trueness and precision is debated in the papers of Patriarca et al. [1] and Menditto et al. [2]. The conclusions of Menditto et al. [2] leave some latitude for discussion, because the authors recognize that these three basic concepts are not yet uniquely defined.

This paper discusses the meaning currently attributed by various scientific organisations to trueness, precision and accuracy, assuming as a reference the International Vocabulary of Metrology 2012 by the Joint Committee for Guides in Metrology (JCGM) [3], which is largely internally consistent and provides a solid base in the field of metrology. Moreover, BIPM (Bureau International des Poids et Mesures), IEC (International Electrotechnical Commission), IFCC (International Federation of Clinical Chemistry), ILAC (International Laboratory Accreditation Cooperation), ISO (International Organization for Standardization), IUPAC (International Union of Pure and Applied Chemistry), IUPAP (International Union of Pure and Applied Physics) and OIML (Organisation Internationale de Métrologie Légale) are members of the JCGM, which should ensure a wide and correct dissemination of metrological definitions throughout the world; nevertheless, an examination of specific documents (standards, guides and so on) still reveals discrepancies and inconsistencies damaging the standardization of terms and procedures. Especially in view of the necessity of validating measurement methods (in the fields of metrology and analytical chemistry), explicitly expressed by international standardization bodies and also requested in specific national and European regulations, a revision of some basic concepts still seems opportune, if not strictly necessary. Moreover, to promote full harmonization between reports from different scientific bodies, all documents should be tuned to the International Vocabulary of Metrology. Attention must also be paid to using a strictly contextualized language, since many terms can assume different meanings in different areas of science, while we need to tune the reasoning to a circumscribed field with a well-defined vocabulary. This paper focuses on contextualized terminology to avoid inconsistencies and confusion in the meanings of specific widespread basic terms.

Current definitions of trueness, precision and accuracy

Let us now examine relevant definitions—by international scientific institutions—of trueness, precision and accuracy, starting with those provided by JCGM 200:2012 [3].

  1.

    According to the VIM [3], we have:

    • Measurement trueness: “Closeness of agreement between the average of an infinite number of replicate measured quantity values and a reference quantity value”;

    • Measurement precision: “Closeness of agreement between indications or measured quantity values obtained by replicate measurements on the same or similar objects under specified conditions”;

    • Measurement accuracy: “Closeness of agreement between a measured quantity value and a true quantity value of a measurand.”

  2.

    According to the ISO 5725-1:1994 standard [“Accuracy (trueness and precision) of measurement methods and results—part 1: General principles and definitions”] [4], we have:

    • Trueness: “Closeness of agreement between the average value obtained from a large set of test results and an accepted reference value”;

    • Precision: “Closeness of agreement between independent test results obtained under stipulated conditions”;

    • Accuracy: “Closeness of agreement between a test result and an accepted reference value.”

Although the VIM [3] is the reference document in metrology, it must be noted that ISO standards are widely followed worldwide, especially in view of laboratory accreditation procedures. We can cite ISO 3534-1:2006 [5], ISO/IEC 17025:2005 [6] and ISO 15189:2007 [7] as being of particular importance for measurement in chemistry. For this reason, the ISO 5725-1:1994 standard must be carefully examined in this critical overview of metrological concepts.

  3.

    According to the DIN 55350-13:1987-07 standard [8] (DIN = Deutsches Institut für Normung), as an example of national definitions, we have:

    • “Trueness is the qualitative term for the closeness of agreement between the expected value (the arithmetic mean obtained from a large series of test results) of the test results and an accepted reference value”;

    • “Precision is the qualitative term for the closeness of agreement between independent test results obtained under stipulated conditions. Precision depends only on the distribution of random errors and does not relate to the true value or the accepted reference value. The measure of precision is expressed as the standard deviation of the test results”;

    • “The term accuracy consists of two criteria, the precision and the trueness.”

From these definitions, we can infer that:

  • Trueness deals with the systematic error of measurement,

  • Precision deals with the random (or accidental or casual or indeterminate) error of measurement,

  • Accuracy deals with the total (systematic and random) error of measurement.

Let:

  • X_R be the accepted reference value of the quantity X,

  • X_M be the mean value of repeated measurements of X,

  • X_1, X_2, X_3, …, X_i, …, X_(n−1), X_n be the single values of repeated measurements of X.

From the previous definitions, for each quality parameter we need to define an operational quantity that yields a number as quality descriptor, namely:

  • for Trueness: the bias (also called measurement bias),

  • for Precision: the standard deviation (or the variance or the coefficient of variation),

  • for Accuracy: the single deviation,

and, then, we can calculate each operational quantity as follows (a minimal computational sketch is given after the list):

  • Bias = X_M − X_R;

  • Accidental deviation = X_i − X_M, from which the standard deviation is calculated with the usual formula;

  • Single deviation = X_i − X_R.
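
As a minimal numerical illustration of these three operational quantities, here is a short Python sketch (all values are invented for the example; in practice X_R would come from a CRM or another accepted reference):

```python
import statistics

# Hypothetical data: accepted reference value and replicate measurements of X
X_R = 10.00                                   # accepted reference value
X = [10.12, 9.95, 10.08, 10.03, 9.98, 10.10]  # replicate measured values (invented)

X_M = statistics.mean(X)                      # mean of the repeated measurements

bias = X_M - X_R                              # operational quantity for trueness
accidental_deviations = [x - X_M for x in X]  # deviations underlying precision
s = statistics.stdev(X)                       # sample standard deviation (precision)
single_deviations = [x - X_R for x in X]      # operational quantity for accuracy

print(f"X_M = {X_M:.3f}, bias = {bias:.3f}, s = {s:.3f}")
```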

Since the true value of a measurement is always unknown (VIM item 2.11 in [3]), in practice an accepted reference value of a measurand (also named conventional true value or assigned value, namely X_R) is used when available with respect to a specific matrix, analyte and level of concentration. The accepted reference value is usually established by repeatedly measuring a NIST (National Institute of Standards and Technology), a JRC-IRMM (Joint Research Centre, Institute for Reference Materials and Measurements) or another traceable (possibly certified) standard reference material (SRM). Nevertheless, various experimental scenarios call for further clarification of the concept and of its operational achievement. In fact, when dealing with a method-dependent result (as for an operationally defined measurand), a reliable reference value (from a proper certified reference material, CRM) is often unavailable. Let us now develop specific reflections on each quality parameter.

The source of inconsistencies

A brief history of the parameters discussed and defined by international scientific committees might help in understanding the inconsistencies being examined. Before 1994, only accuracy—related to systematic error—and precision—related to random error—were used to describe the quality of a measurement result. A shift in the meaning of these terms appeared with the publication of the ISO 5725-1:1994 series of standards—“Accuracy (trueness and precision) of measurement methods and results” [4]—in which trueness was introduced and a relationship was established between accuracy, trueness and precision. According to VIM, item 2.13 (Note 1) [3], “The concept ‘measurement accuracy’ is not a quantity and is not given a numerical quantity value. A measurement is said to be more accurate when it offers a smaller measurement error”; hence, such a quantitative relationship, highlighted in the paper of Menditto et al. [2] (Fig. 1: Relationships between type of error, qualitative performance characteristics and their quantitative expression), does not exist. As a consequence, the link between accuracy and measurement uncertainty must be critically re-examined. This fact will be further discussed but, before examining the state of the art in detail, we propose to subdivide the scientific route of a measurement method into three steps, since this subdivision will be used for the rationalization of concepts and related quantities:

  • Design level: The measurement method is designed by a scientist (or by a team) and technically optimized, usually with respect to a specific matrix and level of concentration; this step is chemically centered at a research level of action;

  • Validation level: The measurement method is tested (according to a specific protocol which includes various repetitions of measurement and various specific experimental designs for each validation parameter selected) as to its quality performances to define (under statistical control and having established a level of probability, or of risk) qualitative and quantitative margins of proper application; this step is quality centered at a metrological level of action;

  • Application level: The measurement method is used daily by a chemical laboratory, which employs it routinely to provide results, usually obtained by way of a single measurement; this step is customer/business centered at a social level of action.

Each of the last two levels has its own specific modus operandi and, hence, those quality parameters strictly coherent with the predefined goals.

Precision

Precision indicates how close independent measurement results obtained by replicated measurements are to one another and is usually quantitatively expressed by way of the standard deviation, which describes the spread of results obtained under a specific measurement protocol. Both precision and trueness are defined starting from the mean value of repeated measurements; hence, they can be evaluated only by executing a large number of measurements under a specific experimental design typical of a method validation procedure. Moreover, precision can be correctly evaluated only under strictly defined measurement conditions (whence repeatability, intermediate precision and reproducibility arise). Precision is the quality parameter best defined by current standards, and we believe it does not require further discussion, at least at the general level of thinking to which this paper aims to contribute.
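
As a sketch of how the measurement conditions shape the numerical estimate, the following Python fragment contrasts a repeatability estimate (pooled within-day spread) with a rough intermediate-precision estimate (all results together, days confounded). The day-by-day replicates are invented, and a rigorous treatment would use an ANOVA decomposition rather than this simplification:

```python
import statistics

# Hypothetical replicates grouped by day (same sample, same laboratory)
days = [
    [10.02, 10.05, 9.99],   # day 1
    [10.11, 10.14, 10.09],  # day 2
    [9.96, 10.00, 9.98],    # day 3
]

# Repeatability: pooled within-day standard deviation (equal group sizes)
within_vars = [statistics.variance(d) for d in days]
s_r = (sum(within_vars) / len(days)) ** 0.5

# Rough intermediate-precision estimate: spread of all results together
all_results = [x for d in days for x in d]
s_I = statistics.stdev(all_results)

print(f"s_r (repeatability) = {s_r:.4f}, s_I (intermediate) = {s_I:.4f}")
```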

Trueness

The VIM definition of trueness is problematic, since it involves a limit concept that is intractable in finite terms (2.14: “infinite number of replicate measured quantity values”) [3]. Moreover, such an absolute and pure definition should match a true value and not “a reference quantity value” (by 2.14 in [3]). The definitions of both DIN [8] and ISO [4] are, on the contrary, manageable, since they simply (and realistically) consider large series of measurement results (a large number of measurement replicates).

In 2.14, Note 1 of Ref. [3], one can read: “Measurement trueness is not a quantity and thus cannot be expressed numerically”: the estimate of the closeness of agreement is, in fact, the bias (or measurement bias); JCGM [3] refers to the ISO 5725 standard for the numerical expression of the systematic error (2.14 and 2.18 of VIM in [3]).

Trueness can be assessed by using the difference (X_M − X_R) between the measured value (as an average value, X_M) and a reference one (X_R). When assessing trueness, a significant difficulty is often finding the reference values against which measurements can be compared. Two basic techniques are available to evaluate trueness: checking against a reference value for a specific material (matrix), or against a result obtained by a well-described measurement procedure. If no suitable standard material exists, other techniques must be employed [9]. Trueness is considered to be the closeness of agreement between the average value obtained from a large series of measurement results and an accepted reference value. The terminology is very similar to that used for accuracy; nevertheless:

  • trueness applies to the average value of a large number of measurement results, while

  • accuracy applies to a single result of measurement.

Bias (the number associated with trueness, the estimate of a systematic measurement error, item 2.18 of VIM in [3]) is the difference between the average value of a large series of measurement results and the accepted reference value; it is, then, the quantity associated with trueness. Bias, distinguished into laboratory bias and method bias, is equivalent to the total systematic error of the measurement, and a correction negating the systematic error can be made by adjusting for the bias. The ISO 5725-1 standard, moreover, avoids the use of the term bias, because it has different connotations outside the fields of science and engineering, as in medicine and law.
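
For illustration, once the bias has been estimated during validation, the correction is a simple subtraction; the numbers below are hypothetical:

```python
# Hypothetical values: bias estimated during validation against a reference
bias = 0.043         # estimated as X_M - X_R during validation
x_routine = 10.19    # a single routine measurement result (invented)

x_corrected = x_routine - bias  # adjusting for the bias negates the systematic error
print(f"corrected result = {x_corrected:.3f}")
```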

Finally, we want to state that a measurement procedure can be labeled with a binary yes/no judgement as to trueness (confirming its qualitative nature). This goal can be reached by applying a simple t test (assuming a Gaussian population of data) comparing the measured (averaged) value with the reference value.
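
A minimal sketch of such a test, using SciPy's one-sample t test on invented replicates (the null hypothesis being that the population mean equals the reference value X_R):

```python
from scipy import stats

X_R = 10.00                                   # accepted reference value
X = [10.12, 9.95, 10.08, 10.03, 9.98, 10.10]  # replicate measurements (invented)

# Two-sided one-sample t test: H0 states that the population mean equals X_R
t_stat, p_value = stats.ttest_1samp(X, popmean=X_R)

alpha = 0.05
verdict = "trueness confirmed" if p_value > alpha else "significant bias detected"
print(f"t = {t_stat:.3f}, p = {p_value:.3f} -> {verdict} at the {alpha} level")
```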

Accuracy and measurement uncertainty

Especially for accuracy, rough and conflicting definitions can be found in the literature: an inspection is reported, as an example, in a paper from the medical area, Ref. [10]. Accuracy and trueness are both terms related to the deviation from a reference value.

According to the ISO 5725 guide [4], the term accuracy is used to refer to both trueness and precision. Moreover, according to the ISO 3534-1:2006 guide [5], “An increasingly common expression of accuracy is the so called measurement uncertainty, which provides a single figure expression of accuracy.” Also according to the DIN 55350-13:1987-07 standard [8], both trueness and precision are explicitly invoked to define the accuracy of a measurement result; hence, accuracy would be linked to a quantity related to the total measurement error (both systematic and random). According to the DIN 55350-13:1987-07 standard [8] and to the Eurachem document [11], the systematic component of the measurement error is related to trueness, while the random one is related to precision. As to NMKL (Nordic Committee on Food Analysis) [12], accuracy is ignored, while only trueness and precision are considered as validation parameters of a measurement method.

According to the GUM [13], the measurement uncertainty estimate takes account of all recognized experimental effects operating on the measurement result; the uncertainties associated with each identified effect are combined according to established procedures (various calculation models are available to construct an uncertainty budget). Measurement uncertainty leads to a parameter characterizing the dispersion of the values attributed to a measured quantity due to both type A and type B uncertainties. It is noteworthy that a measurement uncertainty is expressed in terms of a standard deviation (possibly including a coverage factor of statistical meaning, leading to a confidence interval): in our opinion, this is in contrast with the ISO [4, 5] definition of accuracy and with the definition of measurement uncertainty [3]. Clarity on the topic can be achieved by invoking the useful distinction between the Error approach (systematic and random errors) and the Uncertainty approach (type A and type B uncertainties) to the evaluation of measurement quality [13]. In fact, it is incoherent that the same quantity, accuracy, is defined by including and combining:

  • a single deviation from a reference value (a simple difference), as for trueness, and

  • a standard deviation from an average value (a dispersion index), as for precision, or

  • expressed as dispersion of the values attributed to a measured quantity due to both type A and type B uncertainties.

This concept is also sustained in the papers of Hubert et al. [14–17], in which the authors wrote: “In fact, one cannot measure in only one parameter, difference compared to a reference value and a dispersion of the results.” Some conceptual aspects are still controversial and deserve a further joint discussion of trueness, precision and accuracy, together with an attempt at clarification and agreement between theory (definitions) and practice (numerical quantities derived from the theoretical definitions).

Discussion

In spite of their apparent simplicity, trueness, precision and accuracy are not yet uniquely defined in one document for the whole world. As fully reported above, precision is the quality parameter best defined by current standards, and we believe it does not require further discussion, at least at the general level of thinking to which this paper aims to contribute. On the other hand, trueness, defined as “Closeness of agreement between the average of an infinite number of replicate measured quantity values and a reference quantity value” (VIM item 2.14 in [3]), represents an inconsistent definition. As a proposal, trueness could be considered only an idealized concept and defined as “Closeness of agreement between the average of an infinite number of replicate measured quantity values and a true value of a measurand.” Note the matching between the unrealizable “infinite number of replicate measured quantity values” and the nonexistent (unknowable, VIM item 2.11 in [3]) “true quantity value” (another idealized concept).

Then, the corresponding real concept could be better expressed by way of a new term, i.e., exactness, defined as “Closeness of agreement between the average of a large number of replicate measured quantity values and a reference quantity value of a measurand.” Note the matching between the realizable “average of a large number of replicate measured quantity values” and the often (though not always) existent “reference quantity value” (see the definition of trueness in the ISO 5725-1:1994 standard [4], here adopted in our proposal to define exactness). The GUM [13], in Annex G, item G.1.6, uses the term exactness, writing “a great deal of exactness.” Hence, “exact” is not opposed to “wrong” as in mathematics (contextualization of language) and can properly be used in the context of metrology to manage, without ambiguity, the experimental variability (a gradual scale of judgement with respect to a recognized reference) that does not exist in mathematics (an absolute binary judgement exact/wrong). Then, a measurement result can be exact (or not) according to a pre-established degree of agreement with an accepted reference value. Finally, exactness refers to the degree of matching of a measurement result (validation field) with a reference value adopted for a certain measurand.

Regarding accuracy, the statement by DIN [8] that “The term accuracy consists of two criteria, the precision and the trueness” (also reflected by the ISO standard in [4]) is, in our opinion, wrong and misleading. Moreover, distinguishing trueness (exactness, in our proposal), accuracy and precision only on the basis of the type of error, random or systematic, is neither sufficient nor adequate; in fact, a combination of systematic and random errors can be achieved only in terms of the mean squared error of a variable. The mean squared error is the sum of the squared bias and the observed variance, and this formulation is of only theoretical interest since, of course, the true value is unknown.
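
The decomposition can nevertheless be verified numerically in a simulation, where a true value can be postulated; the following sketch uses purely illustrative numbers:

```python
import random
import statistics

random.seed(1)
true_value = 10.0              # knowable only in a simulation
sys_error, sigma = 0.05, 0.10  # illustrative systematic error and random spread

X = [true_value + sys_error + random.gauss(0.0, sigma) for _ in range(100_000)]

mse = statistics.mean([(x - true_value) ** 2 for x in X])
bias = statistics.mean(X) - true_value
variance = statistics.pvariance(X)

# MSE = bias^2 + variance; the identity is exact when the population
# variance of the sample is used alongside the sample mean
print(f"MSE = {mse:.5f}, bias^2 + variance = {bias**2 + variance:.5f}")
```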

Table 1 collects a synthesis of the approaches to the evaluation of measurement quality and the related qualitative and quantitative indicators. The error approach and the uncertainty approach are different ways of assessing the quality of measurements [13]; hence, since:

Table 1 Approaches to the evaluation of measurement quality and related qualitative and quantitative indicators

  • the uncertainties are the effects of both random and systematic errors,

  • the uncertainties are derived from a function y = f(x) typical of the measurement model adopted,

  • “There is not always a simple correspondence between the classification into categories A or B and the previously used classification into ‘random’ and ‘systematic’ uncertainties” (“Introduction”, item 0.7 of Ref. [13]),

they must never be confused or mixed, either in conceptualization or in calculation.

Joining the concepts of exactness, precision and accuracy (error approach) into a single picture, we have that:

  • Exactness deals with the quality of a large number of measurement results—repetitions required—established during a validation process of a measurement method with respect to the systematic error of measurement;

  • Precision deals with the quality of a large number of measurement results—repetitions required—established during a validation process of a measurement method with respect to the random error of measurement;

  • Accuracy deals with the quality of a single result of measurement—repetitions unrequired—obtained in routine conditions of the measurement method application.

In this connection, and against the statements of both the ISO 5725-1:1994 series of standards [4] and the DIN 55350-13:1987-07 norm [8], it seems to us incorrect to interpret the accuracy of a single result of measurement as composed of trueness and precision, since these three attributes refer to different fields of application of a measurement method. In addition, from NIST TN 1297 (sec. D.1.1.1), we have: «“Accuracy” is a qualitative concept. Because “accuracy” is a qualitative concept, one should not use it quantitatively, that is, associate numbers with it; numbers should be associated with measures of uncertainty instead» [18]. Qualitatively speaking, to obtain a low value of measurement uncertainty, both type A and type B uncertainties have to be reduced, but this does not mean that the measurement error is small. On the other hand, “accurate” derives from the Latin term “accuratus,” which means “made with care,” and “care” is a typically qualitative concept. Hence, definitions of accuracy such as those expressed in terms of closeness of agreement by VIM, ISO and other scientific institutions can be eliminated, since they are devoid of practical usefulness. Furthermore, as in NIST TN 1297 [18], also in the ISO 3534-1:2006 guide [5] we can read: “An increasingly common expression of accuracy is the so called measurement uncertainty, which provides a single figure expression of accuracy.” This affirmation too, in our opinion and according to Hubert et al. [14–17], is wrong and misleading. A dispersion index (as the measurement uncertainty is) cannot result from the sum of a range (accuracy, 2.13 of VIM in [3]) and of a standard deviation (precision, a dispersion index, Note 1 in 2.15 of VIM in [3]): this is a mathematical absurdity. Simply, measurement uncertainty (“Non-negative parameter characterizing the dispersion of the quantity values being attributed to a measurand, based on the information used,” item 2.26 of VIM in [3]) represents a tool to evaluate the range of variability of a single measurement result with respect to the two types of uncertainty contributions currently defined and accepted by the scientific community, namely:

  • that detected from repeated measurement results under defined conditions: type A uncertainty (evaluated by statistical means);

  • that detected by evaluating intrinsic characteristics of the instruments and procedures employed: type B uncertainty (evaluated by means other than statistical ones).

From GUM [13], Annex D, item D.5.1.: “Thus the uncertainty of a result of a measurement is not necessarily an indication of the likelihood that the measurement result is near the value of the measurand. It is simply an estimate of the likelihood of nearness to the best value that is consistent with presently available knowledge.” The measurement uncertainty can be evaluated—with a complex procedure based on different model approaches—even if an accepted reference value is actually unavailable and a single deviation is, hence, unachievable.
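
As a schematic illustration of such an evaluation, the sketch below combines hypothetical type A and type B contributions by root sum of squares, as in a simple GUM-style budget with all sensitivity coefficients assumed equal to 1 (a real budget follows the specific measurement model adopted):

```python
import math

# Hypothetical standard uncertainties for a single routine result,
# all expressed in the unit of the measurand
u_type_A = 0.012                  # from repeated observations (statistical evaluation)
u_type_B = [0.008, 0.005, 0.003]  # e.g. calibration, purity, volume (other means)

# Combined standard uncertainty: root sum of squares of the contributions
u_c = math.sqrt(u_type_A**2 + sum(u**2 for u in u_type_B))

# Expanded uncertainty with coverage factor k = 2 (~95 % under a normal model)
k = 2
U = k * u_c
print(f"u_c = {u_c:.4f}, U (k = 2) = {U:.4f}")
```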

Conclusions and proposals

With the goal of reaching rational, sharp and unambiguous definitions of trueness, precision and accuracy, without recourse to specific notes and footnotes clarifying ambiguous texts based on fuzzy concepts, we propose to abandon, or to redefine, the term trueness. A measurement result can be more or less exact with respect to a reference value; therefore, exactness might be an effective term, while trueness is simply a term corresponding to an ideal concept. We state that a measurement result that is unbiased and precise (according to predefined requirements and under stipulated conditions) is accurate (or, made with care), but accuracy cannot be mixed with trueness and precision at either a conceptual or an operational (quantitative) level, as also sustained by the NIST TN 1297 (sec. D.1.1.1) [18] comment and reflected in the IUPAC Gold Book (“Accuracy is a qualitative concept”) [19] and in the VIM 2012 (“measurement accuracy is not a quantity”) [3], leading us to intend accuracy only in a qualitative fashion, with no quantity linked to it. Besides trueness (exactness, in our proposal) and precision, the metrological and analytical communities can use the measurement uncertainty—a composite standard deviation (combining a variety of contributions)—as the quality indicator for type A and type B uncertainties (uncertainty approach to measurement quality) related to single results of measurement, the output of the routine activity of a laboratory.

Starting from the aforementioned considerations and definitions of international bodies, and attempting a rational synthesis, we collect the following list of concepts:

  1.

    we propose to intend the current term trueness only in an idealized meaning, defining it as “Closeness of agreement between the average of an infinite number of replicate measured quantity values and a true quantity value of a measurand”; trueness is related to a nonexistent true value and could be abandoned as a term in metrology, or employed only with its theoretical meaning made explicit;

  2.

    we also propose to introduce the term exactness, defined as “Closeness of agreement between the average of a large number of replicate measured quantity values and a reference quantity value of a measurand,” correctly and lucidly describing the matching between a measurement result—calculated from a large number of test values (under specified experimental conditions)—and an accepted (conventional) reference quantity value;

  3.

    measurement bias (or bias), the estimate of the systematic measurement error, will be used as the term indicating the quantitative expression of exactness;

  4.

    both exactness (the qualitative term just proposed) and precision (also a qualitative term) refer to the quality of a large number of measurement results and cannot be mixed with quantities used to describe the quality of a single result of measurement (namely, accuracy); hence, “Accuracy (trueness and precision) of measurement methods and results,” the title of the ISO 5725-1:1994 series of standards [4], is a wrong and misleading formulation and must be abandoned;

  5.

    we believe that the term accuracy can finally be eliminated in its quantitative meaning, while it can properly be used only in a qualitative meaning, avoiding the association of numbers with it; we therefore propose to adopt exclusively a qualitative meaning of accuracy, whose use is circumscribed to the quality assessment of measurement results (a large number of measurement results being required, the quality of single results is therefore excluded); hence, accurate is only an adjective used to indicate the quality of an unbiased and precise measurement result (under defined measurement scenarios); from this point of view, the VIM definition of accuracy (item 2.13) as “Closeness of agreement between a measured quantity value and a true quantity value of a measurand” [3] must be abandoned, since “a closeness of agreement” leads to a calculation yielding a numerical quantity value, which is meaningless and also incoherent with the text of Note 1 underlining the qualitative nature of accuracy;

  6.

    the quality of a single result of measurement can be assessed and expressed by way of the measurement uncertainty (a quantitative term), which can be quantified by constructing the uncertainty budget according to a specific adopted calculation model (uncertainty approach) coherently combining type A and type B uncertainties, properly expressed as standard deviations; no relationship between accuracy and measurement uncertainty can be sustained according to this scheme of terminology.

Table 2 synthesizes, into an organic and comprehensive scheme, our view of the quantities implied in the error approach to measurement quality evaluation.

Table 2 Relationships between type of measurement error, performance attribute of a measurement method (validation activity) and their numerical expression

Probably, the next metrological documents serving as vocabularies will be asked to identify a series of idealized concepts, together with the meanings attributed to them, and then to provide the meanings of the corresponding real concepts mirroring the ideal ones. This will clarify the general landscape of measurement science and will help scientists to distinguish what is practically addressable from what is only the result of a limiting mental model.