
1 Introduction

Information quality (IQ) evaluation is of major importance for information processing and for supporting decision-making under uncertainty. In [1], the authors introduced the Accessibility, Interpretability, Relevance, and Integrity concepts as the main attributes describing information quality in the context of assurance and belief networks, but unfortunately they presented only general concepts, without explicit formulas to evaluate these attributes quantitatively. In several recent books devoted to IQ [2,3,4,5], the authors proposed different models and methods of IQ evaluation. Recently, in [6], Bouhamed et al. proposed a quantitative IQ evaluation in the possibility theory framework, which could be extended to the belief functions framework with further investigation. In this latter work, the information quantity component required for the IQ evaluation is based on Gini’s entropy rather than the classical Shannon entropy. From the examination of these references (and some references therein), it is far from obvious to make a clear, justified choice among all these methods, especially when the uncertain information is modeled by belief functions (BF). What is clear, however, is that several distinct factors (or components) must be taken into account in the IQ evaluation mechanism. In this paper we focus on one of these components, the Information Content (IC) component, which we consider to be the most essential component for IQ evaluation and indispensable for developing an effective IQ evaluation method in future research works.

It is worth noting that we do not address the whole IQ evaluation problem in this work; rather, we provide a mathematical solution for measuring the IC of any Basic Belief Assignment (BBA) in the belief functions (BF) framework. Our new IC measure is interpreted as the dual of an effective Measure of Uncertainty (MoU) developed recently in [7]. We show how to calculate the IC of a BBA, and we also discuss the notions of information gain and information loss in the BF context. In our opinion, a measure of Information Content cannot be defined independently of a Measure of Uncertainty (MoU) because the two must be strongly related to each other. Actually, these measures are two sides of the same abstract coin: on one side (the uncertainty side), the more uncertainty content we have, the harder the decision or choice to make; on the other side (the information side), the more information content we have, the easier and stronger the decision or choice to make. This very simple and natural principle will be clarified mathematically next. Hence, the measure of information content of a BBA must reflect the easiness and the strength of the choice of an element of the frame of discernment drawn from the BBA (i.e. in the decision-making). This paper is organized as follows. After a brief recall of the basics of belief functions in Sect. 2, we recall the effective MoU adopted in this work in Sect. 3. Section 4 defines the measure of information content of a BBA and the information granules vector. Section 5 introduces the notions of information gain and information loss. Conclusions and perspectives appear in the last section.

2 Belief Functions

Belief functions (BF) were introduced by Shafer [8] for modeling epistemic uncertainty, reasoning about uncertainty, and combining distinct sources of evidence. The answer to the problem under consideration is assumed to belong to a known finite discrete frame of discernment (FoD) \(\varTheta =\{\theta _1,\ldots ,\theta _N\}\) where all elements (i.e. members) of \(\varTheta \) are exhaustive and mutually exclusive. The set of all subsets of \(\varTheta \) (including the empty set \(\emptyset \) and \(\varTheta \)) is the power set of \(\varTheta \), denoted by \(2^\varTheta \). The number of elements (i.e. the cardinality) of the power set is \(2^{|\varTheta |}\). A (normalized) basic belief assignment (BBA) associated with a given source of evidence is a mapping \(m^\varTheta (\cdot ):2^\varTheta \rightarrow [0,1]\) such that \(m^\varTheta (\emptyset )=0\) and \(\sum _{X\in 2^\varTheta } m^\varTheta (X) = 1\). A BBA \(m^\varTheta (\cdot )\) characterizes a source of evidence related to a FoD \(\varTheta \). For notational shorthand, we can omit the superscript \(\varTheta \) in the \(m^\varTheta (\cdot )\) notation if there is no ambiguity on the FoD we work with. The quantity m(X) is called the mass of belief of X. The element \(X\in 2^\varTheta \) is called a focal element (FE) of \(m(\cdot )\) if \(m(X)>0\). The set of all focal elements of \(m(\cdot )\) is denoted by \(\mathcal {F}_{\varTheta }(m)\triangleq \{X\in 2^\varTheta | m(X)>0 \}\).

The belief and the plausibility of X are defined for any \(X\in 2^\varTheta \) by [8]

$$\begin{aligned} Bel(X) = \sum _{Y\in 2^\varTheta | Y\subseteq X} m(Y) \end{aligned}$$
(1)
$$\begin{aligned} Pl(X) = \sum _{Y\in 2^\varTheta | X\cap Y\ne \emptyset } m(Y)=1-\text {Bel}(\bar{X}). \end{aligned}$$
(2)

where \({\bar{X}\triangleq \varTheta \setminus X}\) is the complement of X in \(\varTheta \).

One has always \({0\le Bel(X)\le Pl(X) \le 1}\), see [8]. For \({X=\emptyset }\), \({Bel(\emptyset )=0}\) and \({Pl(\emptyset )=0}\), and for \({X=\varTheta }\) one has \({Bel(\varTheta )=1}\) and \({Pl(\varTheta )=1}\). Bel(X) and Pl(X) are often interpreted as the lower and upper bounds of unknown probability P(X) of X, that is \({Bel(X) \le P(X) \le Pl(X)}\). To quantify the uncertainty (i.e. the imprecision) of \({P(X)\in [Bel(X), Pl(X)]}\), we use the notation \(u(X)\in [0,1]\) defined by

$$\begin{aligned} u(X)\triangleq Pl(X)-Bel(X) \end{aligned}$$
(3)

The quantity \(u(X)=0\) if \({Bel(X)=Pl(X)}\) which means that P(X) is known precisely, and one has \(P(X)=Bel(X)=Pl(X)\). One has \({u(\emptyset )=0}\) because \({Bel(\emptyset )=Pl(\emptyset )=0}\), and one has \({u(\varTheta )=0}\) because \({Bel(\varTheta )=Pl(\varTheta )=1}\). If all focal elements of \(m(\cdot )\) are singletons of \(2^\varTheta \) the BBA \(m(\cdot )\) is a Bayesian BBA because \({\forall X\in 2^\varTheta }\) one has \({Bel(X)=Pl(X)=P(X)}\) and \(u(X)=0\). Hence the belief and plausibility of X coincide with a probability measure P(X) defined on the FoD \(\varTheta \). The vacuous BBA characterizing a totally ignorant source of evidence is defined by \({m_v(X)=1}\) for \({X=\varTheta }\), and \({m_v(X)=0}\) for all \({X\in 2^\varTheta }\) different from \(\varTheta \). This particular BBA has played a major role in the establishment of a new effective measure of uncertainty of BBA defined in [7].
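To make these definitions concrete, here is a minimal Python sketch (not part of the original paper; the data representation and all names are our own illustrative choices) computing Bel, Pl and u from a BBA stored as a dictionary mapping focal elements (frozensets of FoD members) to masses.

```python
# Illustrative sketch of Bel(X), Pl(X) and u(X) = Pl(X) - Bel(X), Eqs. (1)-(3),
# for a BBA stored as {frozenset of FoD members: mass}.
from itertools import chain, combinations

def powerset(fod):
    """All subsets of the FoD, from the empty set up to the FoD itself."""
    s = list(fod)
    return [frozenset(c) for c in chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

def bel(X, m):
    """Bel(X): sum of the masses of the focal elements included in X, Eq. (1)."""
    return sum(v for Y, v in m.items() if Y <= X)

def pl(X, m):
    """Pl(X): sum of the masses of the focal elements intersecting X, Eq. (2)."""
    return sum(v for Y, v in m.items() if Y & X)

def u(X, m):
    """Imprecision u(X) = Pl(X) - Bel(X) of the unknown probability P(X), Eq. (3)."""
    return pl(X, m) - bel(X, m)

# Illustrative BBA on Theta = {t1, t2}: m(t1) = 0.5, m(t2) = 0.3, m(t1 u t2) = 0.2
fod = {"t1", "t2"}
m = {frozenset({"t1"}): 0.5, frozenset({"t2"}): 0.3, frozenset({"t1", "t2"}): 0.2}
print(bel(frozenset({"t1"}), m), pl(frozenset({"t1"}), m))  # 0.5 0.7
```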

3 Generalized Entropy of a BBA

In [9] we analyzed in detail forty-eight measures of uncertainty (MoU) of BBAs, covering 40 years of research on this topic. Some of these MoUs capture only a particular aspect of the uncertainty inherent to a BBA (typically, the non-specificity and the conflict). Other MoUs propose a total uncertainty measure intended to capture several aspects of the uncertainty jointly. Unfortunately, most of these MoUs fail to satisfy four very simple, reasonable, and essential desiderata, and so they cannot be considered as really effective and useful. Actually, only six MoUs can be considered as effective in the mathematical sense presented next, but unfortunately they appear conceptually defective and disputable, see the discussions in [9]. That is why a better, effective measure of uncertainty (MoU), i.e. a generalized entropy of BBAs, has been developed and presented in [7]. The mathematical definition of this new effective entropy is given by

$$\begin{aligned} U(m) =\sum _{X\in 2^\varTheta } s(X) \end{aligned}$$
(4)

with

$$\begin{aligned} s(X)\triangleq - m(X)(1-u(X))\log (m(X)) + u(X)(1-m(X)) \end{aligned}$$
(5)

The quantity \({-(1-u(X))\log (m(X))=(1-u(X))\log (1/m(X))}\) entering in s(X) in (5) is the surprisal \(\log (1/m(X))\) of X discounted by the confidence \((1-u(X))\) one has in the precision of P(X). The term \({-m(X)(1-u(X))\log (m(X))}\) is the weighted discounted surprisal of X. The term \({u(X)(1-m(X))}\) entering in (5) corresponds to the imprecision of P(X) discounted by \((1-m(X))\), because the greater m(X), the less one should take the imprecision u(X) into account in the MoU. The quantity s(X) is the uncertainty contribution related to the element X (named the entropiece of X) in the MoU U(m). This entropiece s(X) involves m(X) and the imprecision \(u(X)=Pl(X)-Bel(X)\) about the unknown probability of X in a subtle, interwoven manner. The cardinality of X is indirectly taken into account in the derivation of s(X) through u(X), which requires the derivation of the Pl(X) and Bel(X) functions depending on the cardinality of X. Because \(u(X)\in [0,1]\) and \(m(X)\in [0,1]\), one has \(s(X)\ge 0\) and \(U(m)\ge 0\). The quantity U(m) is expressed in nats because we use the natural logarithm. U(m) can be expressed in bits by dividing its value in nats by \(\log (2)=0.69314718...\). This measure of uncertainty U(m) is a continuous function of its basic belief mass arguments because it is a summation of continuous functions. In formula (5), we always take \(m(X)\log (m(X))=0\) when \(m(X)=0\) because \(\lim _{m(X)\rightarrow 0^+} m(X)\log (m(X))=0\), which can be proved using L’Hôpital's rule [11]. Note that for any BBA m, one always has \(s(\emptyset )=0\) because \(m(\emptyset )=0\) and \(u(\emptyset )=Pl(\emptyset )-Bel(\emptyset )=0-0=0\). For the vacuous BBA, one has \(s(\varTheta )=0\) because \(m_v(\varTheta )=1\) and \(u(\varTheta )= Pl(\varTheta ) -Bel(\varTheta )=1-1=0\).

The set \(\{s(X),X\in 2^\varTheta \}\) of the entropiece values s(X) can be represented by an entropiece vector \(\textbf{s}(m^\varTheta )=[s(X), X\in 2^\varTheta ]^T\), where any order of the elements X of the power set \(2^\varTheta \) can be chosen. For simplicity, we suggest using the classical N-bits representation (if \(|\varTheta |=N\)) with the increasing order - see the next example.
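As an illustration, the entropiece s(X) of (5) and the entropy U(m) of (4) can be computed with the following sketch, which reuses the powerset and u helpers from the previous code sketch (all names are our own, not from [7]).

```python
# Sketch of the entropiece s(X) of Eq. (5) and of the entropy U(m) of Eq. (4), in nats.
# Reuses powerset() and u() from the previous sketch; the convention
# m(X) log m(X) = 0 when m(X) = 0 is applied explicitly.
from math import log

def entropiece(X, m):
    """s(X) = -m(X)(1 - u(X)) log m(X) + u(X)(1 - m(X))."""
    mX = m.get(X, 0.0)
    uX = u(X, m)
    surprisal_term = 0.0 if mX == 0.0 else -mX * (1.0 - uX) * log(mX)
    return surprisal_term + uX * (1.0 - mX)

def entropy_U(m, fod):
    """U(m): sum of the entropieces s(X) over all X of the power set of the FoD."""
    return sum(entropiece(X, m) for X in powerset(fod))
```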

This measure of uncertainty U(m) is effective because it can be proved (see proofs in [7]) that it satisfies the following four essential properties:

1. \(U(m) =0\) for any BBA \(m(\cdot )\) focused on a singleton X of \(2^\varTheta \).
2. \(U(m^\varTheta _v) < U(m^{\varTheta '}_v)\) if \(|\varTheta | < |\varTheta '|\).
3. \(U(m)= - \sum _{X\in \varTheta } m(X)\log (m(X))\) if the BBA \(m(\cdot )\) is a Bayesian BBA. Hence, U(m) reduces to Shannon entropy [12] in this case.
4. \(U(m)< U(m_v)\) for any non-vacuous BBA \(m(\cdot )\) and for the vacuous BBA \(m_v(\cdot )\) defined with respect to the same FoD.

The proof of the first three properties is quite simple to establish. The proof of the last property is much more difficult. As explained in [7], we do not consider the sub-additivity property [13] of U(m) to be a fundamental desideratum that an effective MoU must satisfy in general. In fact, the sub-additivity desideratum is incompatible with the fourth important property \(U(m)< U(m_v)\) above, which stipulates that no non-vacuous BBA can be more uncertain (i.e. more ignorant about the problem under consideration) than the vacuous BBA. Actually, it does not make sense to have the entropy \(U(m_v^{\varTheta \times \varTheta '})\) of the vacuous joint BBA \(m_v^{\varTheta \times \varTheta '}\) defined on the Cartesian product space \(\varTheta \times \varTheta '\) smaller than (or equal to) the sum \(U(m_v^{\varTheta })+U(m_v^{\varTheta '})\) of the entropies of the vacuous BBAs \(m_v^{\varTheta }\) and \(m_v^{\varTheta '}\) defined respectively on \(\varTheta \) and \(\varTheta '\). There is no theoretical justification, nor any intuitive reason, for this sub-additivity desideratum in the context of non-Bayesian BBAs. Of course, for Bayesian BBAs, U(m) is equivalent to Shannon entropy, which is sub-additive in this case.

It can also be proved, see [7] for details, that the entropy of the vacuous BBA \(m_v\) related to a FoD \(\varTheta \) is equal to

$$\begin{aligned} U(m_v^\varTheta )=2^{|\varTheta |}-2 \end{aligned}$$
(6)

This maximum entropy value \(U(m_v)\) makes perfect sense because for this very particular BBA there is no information at all about the conflicts between the elements of the FoD. Actually for all \({X\in 2^\varTheta \setminus \{\emptyset ,\varTheta \}}\) one has \({u(X)=1}\) because \({[Bel(X),Pl(X)]=[0,1]}\), and one has \(u(\emptyset )=0\) and \(u(\varTheta )=0\). Hence, the sum of all imprecisions of P(X) for all \({X\in 2^\varTheta }\) is exactly equal to \({2^{|\varTheta |} -2}\) which corresponds to \(U(m_v^\varTheta )\) as expected. Moreover, one has always \({U(m_v^\varTheta )> \log (|\varTheta |)}\) which means that the vacuous BBA has always an entropy greater than the maximum of Shannon entropy \({\log (|\varTheta |)}\) obtained with the uniform probability mass function distributed on \(\varTheta \).

Example 1 of Entropy Calculation: consider \(\varTheta =\{\theta _1,\theta _2\}\) and the BBA \(m^\varTheta (\theta _1)=0.5\), \(m^\varTheta (\theta _2)=0.3\) and \(m^\varTheta (\theta _1\cup \theta _2)=0.2\), then one has \([Bel(\emptyset ),Pl(\emptyset )]=[0,0]\) and \(u(\emptyset )=0\), \([Bel(\theta _1),Pl(\theta _1)]=[0.5,0.7]\), \([Bel(\theta _2),Pl(\theta _2)]=[0.3,0.5]\), and \([Bel(\varTheta ),Pl(\varTheta )]=[1,1]\). Hence, \(u(\theta _1)=0.2\), \(u(\theta _2)=0.2\) and \(u(\varTheta )=0\). Applying (5), one gets \(s(\emptyset )=0\), \(s(\theta _1)\approx 0.377258\), \(s(\theta _2)\approx 0.428953\) and \(s(\varTheta )\approx 0.321887\). Using the 2-bits representation with increasing ordering, we encode the elements of the power set as \(\emptyset =00\), \(\theta _1=01\), \(\theta _2=10\) and \(\theta _1\cup \theta _2=11\). The entropiece vector for this simple example is

$$\begin{aligned} \textbf{s}(m^\varTheta )= \begin{bmatrix} s(\emptyset )\\ s(\theta _1)\\ s(\theta _2)\\ s(\theta _1\cup \theta _2) \end{bmatrix} \approx \begin{bmatrix} 0\\ 0.377258\\ 0.428953\\ 0.321887 \end{bmatrix} \end{aligned}$$
(7)

If we use the classical N-bits (here \(N=2\)) representation with increasing ordering (as we recommend), the first component of the entropiece vector \(\textbf{s}(m^\varTheta )\) is \(s(\emptyset )\), which is always equal to zero for any BBA m. By summing all the components of the entropiece vector \(\textbf{s}(m^\varTheta )\) we obtain the entropy \(U(m^\varTheta )\approx 1.128098\) nats of the BBA \(m^\varTheta (\cdot )\). Note that the components s(X) (for \(X\ne \emptyset \)) of the entropiece vector \(\textbf{s}(m^\varTheta )\) are not independent because they are linked to each other through the calculation of the Bel(X) and Pl(X) values entering in u(X).
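For the record, Example 1 can be checked numerically with the sketches above (functions powerset, entropiece and entropy_U defined earlier; values rounded in the comments).

```python
# Numerical check of Example 1 with the illustrative sketch (values rounded).
fod = {"t1", "t2"}
m = {frozenset({"t1"}): 0.5, frozenset({"t2"}): 0.3, frozenset({"t1", "t2"}): 0.2}
for X in powerset(fod):
    print(sorted(X), round(entropiece(X, m), 5))
    # [] 0.0, ['t1'] 0.37726, ['t2'] 0.42895, ['t1', 't2'] 0.32189
print(round(entropy_U(m, fod), 5))  # approx 1.1281 nats
```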

Example 2 of Entropy Calculation: for the vacuous BBA \(m_v^\varTheta \), and when using the binary increasing encoding of elements of \(2^\varTheta \), the first component \(s(\emptyset )\) and the last component \(s(\varTheta )\) of entropiece vector \(\textbf{s}(m_v^\varTheta )\) will always be equal to zero, and all other components of \(\textbf{s}(m_v^\varTheta )\) will be equal to one. For instance, if we consider \(\varTheta =\{\theta _1,\theta _2\}\) and the vacuous BBA \(m_v^\varTheta (\theta _1)=0\), \(m_v^\varTheta (\theta _2)=0\) and \(m_v^\varTheta (\theta _1\cup \theta _2)=1\), the corresponding entropiece vector \(\textbf{s}(m_v^\varTheta )\) is

$$\begin{aligned} \textbf{s}(m_v^\varTheta )= \begin{bmatrix} s(\emptyset )\\ s(\theta _1)\\ s(\theta _2)\\ s(\theta _1\cup \theta _2) \end{bmatrix} = \begin{bmatrix} 0\\ 1\\ 1\\ 0 \end{bmatrix} \end{aligned}$$
(8)

By summing all the components of the entropiece vector \(\textbf{s}(m_v^\varTheta )\) we obtain the entropy value \(U(m_v^\varTheta )=2\) nats for this vacuous BBA \(m_v^\varTheta (\cdot )\), which is of course in agreement with the formula (6).

4 Information Content of a BBA

We consider a (non-empty) FoD of cardinality \(|\varTheta |=N\), and we model our state of knowledge about the problem under consideration by a BBA defined on \(2^\varTheta \). Without more knowledge than the FoD itself (and its cardinality N), we are totally ignorant about the solution of the problem we want to solve, and of course we have no clue for making a decision/choice among the elements of the FoD. The BBA reflecting this totally ignorant situation is the vacuous BBA \(m_v(\cdot )\), whose maximal entropy is \(U(m_v)=2^N-2\). In such a case, we naturally expect the information content to be zero when the uncertainty measure is maximal. In the opposite case, it is very natural to consider that the information content of a BBA is maximal if the entropy value (the MoU value) of the BBA \(m(\cdot )\) is zero, meaning that we can choose one element of the FoD without hesitation. Based on these very simple ideas, we propose to define the information content of any BBA \(m(\cdot )\) as the dual of the effective measure of uncertainty, more precisely by

$$\begin{aligned} IC(m^\varTheta )\triangleq U(m_v^\varTheta )-U(m^\varTheta )=(2^{|\varTheta |}-2) - \sum _{X\in 2^\varTheta } s(X) \end{aligned}$$
(9)

where s(X) is the entropiece of the element \(X\in 2^\varTheta \) given by (5), that is

$$\begin{aligned} s(X)\triangleq - (1-u(X))m^\varTheta (X)\log (m^\varTheta (X)) + u(X)(1-m^\varTheta (X)) \end{aligned}$$

and where u(X) is the level of imprecision of the probability P(X) given by

$$\begin{aligned} u(X)=Pl^\varTheta (X)-Bel^\varTheta (X) = \sum _{Y\in 2^\varTheta | X\cap Y\ne \emptyset } m^\varTheta (Y) - \sum _{Y\in 2^\varTheta | Y\subseteq X} m^\varTheta (Y) \end{aligned}$$
(10)

From the definition (9), one sees that for \(m^\varTheta \ne m_v^\varTheta \) one has \(IC(m^\varTheta )> 0\) because \(U(m^\varTheta )< U(m_v^\varTheta )\), and for \(m^\varTheta =m_v^\varTheta \) one has \(IC(m_v^\varTheta )=0\), which is what we naturally expect.
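A direct implementation of (9), reusing entropy_U from the sketch of Sect. 3, could look as follows (again an illustrative sketch, not the authors' code).

```python
# Sketch of the information content IC(m) of Eq. (9), in nats.
# The maximal entropy U(m_v) = 2^|FoD| - 2 is that of the vacuous BBA, Eq. (6).
def information_content(m, fod):
    U_vacuous = 2 ** len(fod) - 2
    return U_vacuous - entropy_U(m, fod)

# For the BBA of Example 1: IC(m) = 2 - 1.1281 approx 0.8719 nats
```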

It is worth mentioning that the information content \(IC(m^\varTheta )\) of a BBA depends not only on the BBA \(m(\cdot )\) itself but also on the cardinality of the frame of discernment \(\varTheta \), because \(IC(m^\varTheta )\) requires the knowledge of \(|\varTheta |\) to calculate the maximal entropy value \(U(m_v^\varTheta )=2^{|\varTheta |}-2\) entering in (9). This remark is very important to understand that even if two BBAs (defined on FoDs of different cardinalities) focus entirely on the same focal element, their information contents are necessarily different. For instance, if we consider the Bayesian BBA with \(m^\varTheta (\theta _1)=1\) defined on the FoD \(\varTheta =\{\theta _1,\theta _2\}\), then

$$IC(m^\varTheta )=U(m_v^\varTheta )-U(m^\varTheta )=(2^{|\varTheta |}-2) - 0=2 \ \text {(nats)}$$

whereas if we consider the Bayesian BBA with \(m^{\varTheta '}(\theta _1)=1\) defined on the larger FoD \(\varTheta '=\{\theta _1,\theta _2,\theta _3\}\) (for instance), then

$$IC(m^{\varTheta '})=U(m_v^{\varTheta '})-U(m^{\varTheta '})=(2^{|\varTheta '|}-2) - 0=6 \ \text {(nats)}$$

So even if the decision \(\theta _1\) that we would make based either on \(m^\varTheta \) or on \(m^{\varTheta '}\) is the same, these decisions must not be considered as having the same strength, and this is what our information content measure reflects.

From this very simple definition of the information content, we can also define the Normalized Information Content (NIC) (if needed in some applications), denoted by \(NIC(m^\varTheta )\), by normalizing \(IC(m^\varTheta )\) with respect to the maximal value of entropy \(U(m_v^\varTheta )\) as

$$\begin{aligned} NIC(m^\varTheta )\triangleq \frac{U(m_v^\varTheta )-U(m^\varTheta )}{U(m_v^\varTheta )}=1 - \frac{U(m^\varTheta )}{U(m_v^\varTheta )} \end{aligned}$$
(11)

Hence we will have \(NIC(m^\varTheta )\in [0,1]\) and \(NIC(m^\varTheta )=0\) for \(m=m_v\), and \(NIC(m^\varTheta )=1\) for \(U(m)=0\) which is obtained when \(m(\cdot )\) is entirely focused on a singleton \(\theta _i \in \varTheta \), that is \(m^\varTheta (\theta _i)=1\) for some \(i\in \{1,2,\ldots ,|\varTheta |\}\).
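In code, the NIC of (11) is a one-line variant of the IC sketch above (same assumptions and naming).

```python
# Sketch of the normalized information content NIC(m) of Eq. (11), in [0, 1].
def normalized_information_content(m, fod):
    U_vacuous = 2 ** len(fod) - 2
    return 1.0 - entropy_U(m, fod) / U_vacuous
```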

In fact, the (total) information content of a BBA \(IC(m^\varTheta )\) is the sum of all the information granules \(IG(X|m^\varTheta )\) of elements \(X \in 2^\varTheta \) carried by a BBA \(m^\varTheta \), that is

$$\begin{aligned} IC(m^\varTheta )=\sum _{X\in 2^\varTheta } IG(X|m^\varTheta ) \end{aligned}$$
(12)

where

$$\begin{aligned} IG(X|m^\varTheta )\triangleq {\left\{ \begin{array}{ll} 0, \text {if }X=\emptyset \\ -s(X), \text {if }X=\varTheta \\ 1-s(X) \text { otherwise} \end{array}\right. } \end{aligned}$$
(13)

We can define the information granules vector \(\textbf{IG}(m^\varTheta )=[IG(X|m^\varTheta ), X \in 2^\varTheta ]^T\) by

$$\begin{aligned} \textbf{IG}(m^\varTheta ) \triangleq \textbf{s}(m_v^\varTheta )- \textbf{s}(m^\varTheta ) \end{aligned}$$
(14)

One sees that the (total) information content \(IC(m^\varTheta )\) of a BBA \(m^\varTheta \) is just the sum of all the components \(IG(X|m^\varTheta )\) of the information granules vector \(\textbf{IG}(m^\varTheta )\). This vector is interesting and useful because it helps to see the contribution of each element X in the whole measure of the information content \(IC(m^\varTheta )\) of the BBA \(m^\varTheta \).

Example 1 (continued): consider \(\varTheta =\{\theta _1,\theta _2\}\) and the BBA \(m^\varTheta (\theta _1)=0.5\), \(m^\varTheta (\theta _2)=0.3\) and \(m^\varTheta (\theta _1\cup \theta _2)=0.2\). The information granules vector \(\textbf{IG}(m^\varTheta ) \) is given by

$$\begin{aligned} \textbf{IG}(m^\varTheta ) = \textbf{s}(m_v^\varTheta )- \textbf{s}(m^\varTheta ) = \begin{bmatrix} 0\\ 1\\ 1\\ 0 \end{bmatrix} - \begin{bmatrix} 0\\ 0.377258\\ 0.428953\\ 0.321887 \end{bmatrix} = \begin{bmatrix} 0\\ 0.622742\\ 0.571047\\ -0.321887 \end{bmatrix} \end{aligned}$$
(15)

By summing all the components of the information granules vector \(\textbf{IG}(m^\varTheta )\) we obtain the (total) information content \(IC(m^\varTheta )=0.871902\) nats of the BBA \(m^\varTheta \), which can of course be calculated directly also as

$$IC(m^\varTheta )=U(m_v^\varTheta )-U(m^\varTheta )=2- 1.128098=0.871902$$

However, the information granules vector \(\textbf{IG}(m^\varTheta )\) is interesting to identify the contribution of each element X in the whole measure of the information content.
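The information granules of (13) and their sum (12) can also be checked numerically with the following sketch (reusing entropiece and powerset defined earlier; all names are illustrative).

```python
# Sketch of the information granules IG(X|m) of Eq. (13) and of their sum, Eq. (12).
def information_granule(X, m, fod):
    if len(X) == 0:              # X = empty set
        return 0.0
    if X == frozenset(fod):      # X = FoD
        return -entropiece(X, m)
    return 1.0 - entropiece(X, m)

def information_content_from_granules(m, fod):
    return sum(information_granule(X, m, fod) for X in powerset(fod))

# Example 1 (continued): granules approx 0, 0.62274, 0.57105, -0.32189; sum approx 0.8719 nats
```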

5 Information Gain and Information Loss

Once the IC measure is defined for a BBA, it is rather simple to define the information gain and the information loss of a BBA with respect to another one, both being defined on the same FoD \(\varTheta \). Suppose that we have a first BBA \(m_1^\varTheta \) and a second BBA \(m_2^\varTheta \); then we can calculate by formula (9) their respective information contents \(IC(m_1^\varTheta )\) and \(IC(m_2^\varTheta )\). The difference of the information content measure of \(m_2^\varTheta \) with respect to \(m_1^\varTheta \) is defined by

$$\begin{aligned} \varDelta _{IC}(m_2|m_1)\triangleq IC(m_2^\varTheta )-IC(m_1^\varTheta ) \end{aligned}$$
(16)

If we replace \(IC(m_2^\varTheta )\) and \(IC(m_1^\varTheta )\) by their expressions according to (9), it comes

$$\begin{aligned} \varDelta _{IC}(m_2|m_1)= [U(m_v^\varTheta )-U(m_2^\varTheta )]-[U(m_v^\varTheta )-U(m_1^\varTheta )]=U(m_1^\varTheta )-U(m_2^\varTheta ) \end{aligned}$$
(17)

If \(\varDelta _{IC}(m_2|m_1)=0\), the BBAs \(m_1^\varTheta \) and \(m_2^\varTheta \) have the same measure of information content. So, there is no gain and no loss in information content if one switches from \(m_1^\varTheta \) to \(m_2^\varTheta \) or vice versa. \(\varDelta _{IC}(m_2|m_1)=0\) does not mean that the decisions based on \(m_1^\varTheta \) and on \(m_2^\varTheta \) are the same. It only means that the decision based on \(m_1^\varTheta \) must be as easy to make as the decision based on \(m_2^\varTheta \), that is, the two BBAs have the same informational strength. If \(\varDelta _{IC}(m_2|m_1)>0\), one has \(IC(m_2^\varTheta )>IC(m_1^\varTheta )\), i.e. the BBA \(m_2^\varTheta \) is more informative than \(m_1^\varTheta \). In this case we get an information gain if one switches from \(m_1^\varTheta \) to \(m_2^\varTheta \), and by duality we get an uncertainty reduction by switching from \(m_1^\varTheta \) to \(m_2^\varTheta \). It means that it must be easier to make a decision based on \(m_2^\varTheta \) rather than on \(m_1^\varTheta \). If \(\varDelta _{IC}(m_2|m_1)<0\), one has \(IC(m_2^\varTheta )<IC(m_1^\varTheta )\), i.e. the BBA \(m_2^\varTheta \) is less informative than \(m_1^\varTheta \). In this case we get an information loss if one switches from \(m_1^\varTheta \) to \(m_2^\varTheta \), and by duality we get an increase of uncertainty by switching from \(m_1^\varTheta \) to \(m_2^\varTheta \). It means that it must be easier to make a decision based on \(m_1^\varTheta \) rather than on \(m_2^\varTheta \).

As a simple example, consider \(\varTheta =\{\theta _1,\theta _2,\theta _3\}\). For the vacuous BBA one has \(U(m_v^\varTheta )=2^3-2=6\) nats. Suppose that at time \(k=1\) one has the BBA \(m_1^\varTheta (\theta _1\cup \theta _2)=0.2\), \(m_1^\varTheta (\theta _1\cup \theta _3)=0.3\), \(m_1^\varTheta (\theta _1\cup \theta _2\cup \theta _3)=0.5\); then \(U(m_1^\varTheta )\approx 5.1493\) nats, and \(IC(m_1^\varTheta )=U(m_v^\varTheta )-U(m_1^\varTheta )\approx 0.8507\) nats. Suppose that after some information processing (belief revision, or fusion, etc.) we come up with the BBA \(m_2^\varTheta \) at time \(k=2\) defined by \(m_2^\varTheta (\theta _1)=0.2\) and \(m_2^\varTheta (\theta _3)=0.8\); then \(U(m_2^\varTheta )\approx 0.5004\) nats and \(IC(m_2^\varTheta )=U(m_v^\varTheta )-U(m_2^\varTheta )\approx 5.4996\) nats. In this case, we get \(\varDelta _{IC}(m_2|m_1)\approx 5.4996-0.8507=4.6489\) nats, which is positive. Hence we get an information gain by switching from \(m_1^\varTheta \) to \(m_2^\varTheta \) thanks to the information processing applied.
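This example can be reproduced with the sketches above (illustrative code; values rounded in the comment).

```python
# Sketch of the information gain/loss Delta_IC(m2|m1) of Eq. (16),
# checked on the Sect. 5 example; reuses information_content() from the earlier sketch.
def delta_IC(m2, m1, fod):
    return information_content(m2, fod) - information_content(m1, fod)

fod3 = {"t1", "t2", "t3"}
m1 = {frozenset({"t1", "t2"}): 0.2, frozenset({"t1", "t3"}): 0.3, frozenset({"t1", "t2", "t3"}): 0.5}
m2 = {frozenset({"t1"}): 0.2, frozenset({"t3"}): 0.8}
print(round(delta_IC(m2, m1, fod3), 4))  # approx 4.6489 nats, i.e. an information gain
```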

6 Conclusions

In this paper we have introduced a measure of information content (IC) for any basic belief assignment (BBA). This IC measure, based on an effective measure of uncertainty of BBAs, is quite simple to calculate, and it reflects the informational strength and the ease of making a decision based on any belief mass function. We have also shown how it is possible to identify the contribution of each focal element of the BBA to this information content measure thanks to the information granules vector. This new IC measure is also interesting because it allows one to quantify the information loss or gain between two BBAs, and thus, as a perspective, we could use it to precisely quantify and compare the performance of information processing methods using belief functions (fusion rules, belief conditioning, etc.). We hope that this new theoretical IC measure will open interesting tracks for forthcoming research works on reasoning about uncertainty with belief functions.