1 Introduction

Epistemic logic and probability theory both provide formal accounts of information. Epistemic logic takes a qualitative perspective on information, and works with a modal operator \(K\). Formulas such as \(K\varphi \) can be interpreted as ‘the agent knows that \(\varphi \)’, ‘the agent believes that \(\varphi \)’, or, more generally speaking, ‘\(\varphi \) follows from the agent’s current information’. Probability theory, on the other hand, takes a quantitative perspective on information, and works with numerical probability functions \(P\). Formulas such as \(P(\varphi ) = k\) can be interpreted as ‘the probability of \(\varphi \) is \(k\)’. In the present context, probabilities will usually be interpreted subjectively, and can thus be taken to represent the agent’s degrees of belief or credences.

With respect to one and the same formula \(\varphi \), epistemic logic is able to distinguish between three epistemic attitudes: knowing its truth (\(K\varphi \)), knowing its falsity (\(K\lnot \varphi \)), and being ignorant about its truth value (\(\lnot K\varphi \wedge \lnot K\lnot \varphi \)). Probability theory, however, distinguishes infinitely many epistemic attitudes with respect to \(\varphi \), viz. assigning it probability \(k\) (\(P(\varphi ) = k\)), for every \(k \in [0,1]\). In this sense probability theory can be said to provide a much more fine-grained perspective on information.

While epistemic logic thus is a coarser account of information, it certainly has a wider scope. From its very origins in Hintikka’s [34], epistemic logic has not only been concerned with knowledge about ‘the world’, but also with knowledge about knowledge, i.e. with higher-order information. Typical discussions focus on principles such as positive introspection (\(K\varphi \rightarrow KK\varphi \)). On the other hand, probability theory rarely talks about principles involving higher-order probabilities, such as \(P(\varphi ) = 1 \rightarrow P(P(\varphi )=1)=1\).Footnote 1 This issue becomes even more pressing in multi-agent scenarios. Natural examples might involve an agent \(a\) not having any information about a proposition \(\varphi \), while being certain that another agent, \(b\), does have this information. In epistemic logic this is naturally formalized as

$$\begin{aligned} \lnot K_a\varphi \wedge \lnot K_a\lnot \varphi \wedge K_a(K_b\varphi \vee K_b\lnot \varphi ). \end{aligned}$$

A formalization in probability theory might look as follows:

$$\begin{aligned} P_a(\varphi ) = 0.5 \wedge P_a(P_b(\varphi )=1 \vee P_b(\varphi )=0) = 1. \end{aligned}$$

However, because this statement makes use of ‘nested’ probabilities, it is rarely used in standard treatments of probability theory.

An additional theme is that of dynamics, i.e. information change. The agents’ information is not eternally the same; rather, it should be changed in the light of new incoming information. Probability theory typically uses Bayesian updating to represent information change (but other, more complicated update mechanisms are available as well). Dynamic epistemic logic interprets new information as changing the epistemic model, and uses the new, updated model to represent the agents’ updated information states. Once again, the main difference is that dynamic epistemic logic takes (changes in) higher-order information into account, whereas probability theory does not.

For all these reasons, the project of probabilistic epistemic logic seems very interesting. Such systems inherit the fine-grained perspective on information from probability theory, and the representation of higher-order information from epistemic logic. Their dynamic versions provide a unified perspective on changes in first- and higher-order information. In other words, they can be thought of as incorporating the complementary perspectives of (dynamic) epistemic logic and probability theory, thus yielding richer and more detailed accounts of information and information flow.

The remainder of this chapter is organized as follows. Section 13.2 introduces the static framework of probabilistic epistemic logic, and discusses its intuitive interpretation and technical features. Section 13.3 focuses on a rather straightforward type of dynamics, namely public announcements. It describes a probabilistic version of the well-known system of public announcement logic, and compares public announcement and Bayesian conditionalization. In Sect. 13.4 a more general update mechanism is introduced. This is a probabilistic version of the ‘product update’ mechanism in dynamic epistemic logic. Section 13.5, finally, indicates some applications and potential avenues of further research for the systems discussed in this chapter.

2 Probabilistic Epistemic Logic

In this section we introduce the static framework of probabilistic epistemic logic, which will be ‘dynamified’ in Sects. 13.3 and 13.4. Section 13.2.1 discusses the models on which the logic is interpreted. Section 13.2.2 defines the formal language and its semantics. Finally, Sect. 13.2.3 provides a complete axiomatization.

2.1 Probabilistic Kripke Models

Consider a finite set \(I\) of agents, and a countably infinite set \(Prop\) of proposition letters. Throughout this chapter, these sets will be kept fixed, so they will often be left implicit.

Definition 13.1

A probabilistic Kripke frame is a tuple \(\mathbb {F} = \langle W,R_i,\mu _i\rangle _{i\in I}\), where \(W\) is a non-empty finite set of states, \(R_i \subseteq W\times W\) is agent \(i\)’s epistemic accessibility relation, and \(\mu _i:W \rightarrow (W \rightharpoonup [0,1])\) assigns to each state \(w \in W\) a partial function \(\mu _i(w):W\rightharpoonup [0,1]\), such that

$$\begin{aligned} \sum _{v\in \mathrm {dom}(\mu _i(w))}\mu _i(w)(v) = 1. \end{aligned}$$

Definition 13.2

A probabilistic Kripke model is a tuple \(\mathbb {M} = \langle \mathbb {F},V\rangle \), where \(\mathbb {F}\) is a probabilistic Kripke frame (with set of states \(W\)), and \(V:Prop\rightarrow \wp (W)\) is a valuation.
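To make these definitions concrete, here is a minimal sketch in Python of a probabilistic Kripke model, already with total probability functions (anticipating the simplifying assumptions discussed below); the class and method names are ours, purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class ProbKripkeModel:
    states: set   # W: non-empty finite set of states
    R: dict       # agent -> set of (w, v) pairs: epistemic accessibility
    mu: dict      # agent -> state -> {state: probability}: mu_i(w)
    V: dict       # proposition letter -> set of states: valuation

    def check(self):
        # Definition 13.1: each mu_i(w) must sum to 1 over its domain.
        for table in self.mu.values():
            for dist in table.values():
                assert abs(sum(dist.values()) - 1.0) < 1e-9

    def prob(self, i, w, X):
        # Additive extension of mu_i(w) from states to sets of states,
        # as defined below.
        return sum(self.mu[i][w].get(x, 0.0) for x in X)
```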

Note that in principle, no conditions are imposed on the agents’ epistemic accessibility relations. However, as is usually done in the literature on (probabilistic) dynamic epistemic logic, we will henceforth assume these relations to be equivalence relations (so that the corresponding knowledge operators satisfy the principles of the modal logic S5).

The function \(\mu _i(w)\) represents agent \(i\)’s probabilities (i.e. degrees of belief) at state \(w\). For example, \(\mu _i(w)(v) = k\) means that at state w, agent \(i\) assigns probability \(k\) to state \(v\) being the actual state. From a mathematical perspective, this is not the most general approach: one can also define a probability space \(\mathbb {P}_{i,w}\) for each agent \(i\) and state \(w\), and let \(\mu _i(w)\) assign probabilities to sets in a \(\sigma \)-algebra on \(\mathbb {P}_{i,w}\), rather than to individual states. In this way one can easily drop the requirement that frames and models have finitely many states. This approach is taken in [28] for static probabilistic epistemic logic, and extended to dynamic settings in [47]. However, because all the characteristic features of probabilistic (dynamic) epistemic logic already arise in the simpler approach, in this chapter we will stick to this simpler approach, and take \(\mu _i(w)\) to assign probabilities to individual states. These functions are additively extended from individual states to sets of states, by putting (for any set \(X\subseteq \mathrm {dom}(\mu _i(w))\)):

$$\begin{aligned} \mu _i(w)(X) := \sum _{x\in X}\mu _i(w)(x). \end{aligned}$$

A consequence of our simple approach is that all sets \(X \subseteq \mathrm {dom}(\mu _i(w))\) have a definite probability \(\mu _i(w)(X)\), whereas in the more general approach, sets \(X\) not belonging to the \(\sigma \)-algebra on \(\mathbb {P}_{i,w}\) are not assigned any definite probability at all. A similar distinction can be made at the level of individual states. Because \(\mu _i(w)\) is a partial function, states \(v \in W - \mathrm {dom}(\mu _i(w))\) are not assigned any definite probability at all. An even simpler approach involves putting \(\mu _i(w)(v) = 0\), rather than leaving it undefined. In this way, the function \(\mu _i(w)\) is total after all. From a mathematical perspective, these two approaches are equivalent. From an informal perspective, however, there is a clear difference: \(\mu _i(w)(v) = 0\) means that agent \(i\) is certain (at state \(w\)) that \(v\) is not the actual state, whereas \(\mu _i(w)(v)\) being undefined means that agent \(i\) has no opinion whatsoever (at state \(w\)) about \(v\) being the actual state. Again, because all the characteristic features of probabilistic (dynamic) epistemic logic already arise without this intuitive distinction, we will opt for the even simpler approach, and henceforth assume that all probability functions are total.

To summarize: the approach adopted in this chapter is the simplest one possible, in the sense that definite probabilities are assigned to ‘everything’: (i) to all sets (there is no \(\sigma \)-algebra to rule out some sets from having a definite probability), and (ii) to all states (the probability functions \(\mu _i(w)\) are total on their domain \(W\), so no states are ruled out from having a definite probability).

We finish this subsection by mentioning two typical properties of probabilistic Kripke frames.Footnote 2 In the next subsection we will show that these properties correspond to natural principles about the interaction between knowledge and probability.

Definition 13.3

Consider a probabilistic Kripke frame \(\mathbb {F}\) and an agent \(i\in I\). Then \(\mathbb {F}\) is said to be i-consistent iff for all states \(w,v\): if \((w,v) \notin R_i\) then \(\mu _i(w)(v) = 0\). Furthermore, \(\mathbb {F}\) is said to be i-uniform iff for all states \(w,v\): if \((w,v) \in R_i\) then \(\mu _i(w)=\mu _i(v)\).
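On finite frames both properties are directly decidable; the following sketch (function names ours) checks them against the components of a frame, with \(R_i\) given as a set of pairs and \(\mu _i\) as a dictionary of probability functions.

```python
# A sketch of the frame properties of Definition 13.3.

def is_consistent(states, R_i, mu_i):
    # i-consistency: (w, v) not in R_i implies mu_i(w)(v) = 0.
    return all(mu_i[w].get(v, 0.0) == 0.0
               for w in states for v in states if (w, v) not in R_i)

def is_uniform(states, R_i, mu_i):
    # i-uniformity: (w, v) in R_i implies mu_i(w) = mu_i(v).
    return all(mu_i[w] == mu_i[v] for (w, v) in R_i)
```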

2.2 Language and Semantics

The language \(\fancyscript{L}\) of (static) probabilistic epistemic logic is defined by means of the following Backus-Naur form:

$$\begin{aligned} \varphi \,\,\,::=\,\,\, p \,\,\,|\,\,\, \lnot \varphi \,\,\,|\,\,\, \varphi _1\wedge \varphi _2 \,\,\,|\,\,\, K_i\varphi \,\,\,|\,\,\, a_1P_i(\varphi _1) + \cdots + a_nP_i(\varphi _n)\ge b \end{aligned}$$

—where \(p \in Prop, i\in I, 1 \le n < \omega \), and \(a_1,\ldots , a_n,b \in \mathbb {Q}\). We only allow rational numbers as values for \(a_1,\ldots ,a_n,b\) in order to keep the language countable. As usual, \(K_i\varphi \) means that agent \(i\) knows that \(\varphi \), or, more generally, that \(\varphi \) follows from agent \(i\)’s information. Its dual is defined as \(\hat{K}_i\varphi := \lnot K_i\lnot \varphi \), and means that \(\varphi \) is consistent with agent \(i\)’s information.

Formulas of the form \(a_1P_i(\varphi _1)+\cdots +a_nP_i(\varphi _n)\ge b\) are called i-probability formulas.Footnote 3 Note that mixed agent indices are not allowed; for example, \(P_a(p) + P_b(q) \ge b\) is not a well-formed formula. Intuitively, \(P_i(\varphi )\ge b\) means that agent \(i\) assigns probability at least \(b\) to \(\varphi \). We allow for linear combinations in \(i\)-probability formulas, because this additional expressivity is useful when looking for a complete axiomatization [28], and because it allows us to express comparative judgments such as ‘agent \(i\) considers \(\varphi \) to be at least twice as probable as \(\psi \)’: \(P_i(\varphi ) \ge 2P_i(\psi )\). This last formula is actually an abbreviation for \(P_i(\varphi ) - 2P_i(\psi )\ge 0\). In general, we introduce the following abbreviations:

\(\sum _{\ell =1}^n a_\ell P_i(\varphi _\ell ) \ge b\)    for    \(a_1P_i(\varphi _1) +\cdots + a_nP_i(\varphi _n)\ge b\),

\(a_1 P_i(\varphi _1) \ge a_2 P_i(\varphi _2)\)    for    \(a_1 P_i(\varphi _1) + (- a_2) P_i(\varphi _2)\ge 0 \),

\(\sum _{\ell =1}^n a_\ell P_i(\varphi _\ell ) \le b\)    for    \(\sum _{\ell =1}^n (-a_\ell ) P_i(\varphi _\ell ) \ge -b\),

\(\sum _{\ell =1}^n a_\ell P_i(\varphi _\ell ) < b\)    for    \(\lnot (\sum _{\ell =1}^n a_\ell P_i(\varphi _\ell ) \ge b)\),

\(\sum _{\ell =1}^n a_\ell P_i(\varphi _\ell ) > b\)    for    \(\lnot (\sum _{\ell =1}^n a_\ell P_i(\varphi _\ell ) \le b)\),

\(\sum _{\ell =1}^n a_\ell P_i(\varphi _\ell ) = b\)    for    \(\sum _{\ell =1}^n a_\ell P_i(\varphi _\ell ) \ge b \wedge \sum _{\ell =1}^n a_\ell P_i(\varphi _\ell ) \le b\).

Note that because of its recursive definition, the language \(\fancyscript{L}\) can express the agents’ higher-order information of any sort: higher-order knowledge (for example \(K_aK_b\varphi \)), but also higher-order probabilities (for example \(P_a(P_b(\varphi ) \ge 0.5) = 1\)), and higher-order information that mixes knowledge and probabilities (for example, \(K_a(P_b(\varphi ) \ge 0.5)\) and \(P_a(K_b\varphi ) = 1\)).

The formal semantics for \(\fancyscript{L}\) is defined as follows. Consider an arbitrary probabilistic Kripke model \(\mathbb {M}\) (with set of states \(W\)) and a state \(w \in W\). We will often abbreviate \({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M} := \{v\in W\,|\,\mathbb {M},v\models \varphi \}\). Then:

\(\mathbb {M},w\models p\)    iff    \(w \in V(p)\),

\(\mathbb {M},w\models \lnot \varphi \)    iff    \(\mathbb {M},w\not \models \varphi \),

\(\mathbb {M},w\models \varphi \wedge \psi \)    iff    \(\mathbb {M},w\models \varphi \) and \(\mathbb {M},w\models \psi \),

\(\mathbb {M},w\models K_i\varphi \)    iff    for all \(v\in W\): if \((w,v) \in R_i\) then \(\mathbb {M},v\models \varphi \),

\(\mathbb {M},w\models \sum _{\ell =1}^n a_\ell P_i(\varphi _\ell )\ge b\)    iff    \(\sum _{\ell =1}^n a_\ell \mu _i(w)({{\mathrm{[\![\!}}}\varphi _\ell {{\mathrm{\!]\!]}}}^\mathbb {M})\ge b\).

Furthermore, we also define:

  • \(\mathbb {M}\models \varphi \) iff \(\mathbb {M},w\models \varphi \) for all \(w\in W\),

  • \(\mathbb {F}\models \varphi \) iff \(\langle \mathbb {F},V\rangle \models \varphi \) for all valuations \(V\) on the frame \(\mathbb {F}\),

  • \(\models \varphi \) iff \(\mathbb {F}\models \varphi \) for all frames \(\mathbb {F}\).
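Since these clauses are fully algorithmic on finite models, they translate directly into a model checker. The sketch below (the tuple encoding of formulas and all names are ours) evaluates a formula by recursion on its structure; note that, as in the language \(\fancyscript{L}\), each probability formula fixes a single agent \(i\).

```python
# A sketch of the satisfaction relation on finite probabilistic Kripke models.
# Formulas are nested tuples (an encoding of ours):
#   ("p", letter), ("not", f), ("and", f, g), ("K", i, f),
#   ("geq", i, [(a1, f1), ..., (an, fn)], b)
# where the last case encodes a1*P_i(f1) + ... + an*P_i(fn) >= b.

def sat(W, R, mu, V, w, f):
    op = f[0]
    if op == "p":
        return w in V[f[1]]
    if op == "not":
        return not sat(W, R, mu, V, w, f[1])
    if op == "and":
        return sat(W, R, mu, V, w, f[1]) and sat(W, R, mu, V, w, f[2])
    if op == "K":
        i, g = f[1], f[2]
        return all(sat(W, R, mu, V, v, g) for v in W if (w, v) in R[i])
    if op == "geq":
        i, terms, b = f[1], f[2], f[3]
        # mu_i(w), extended additively to the truth set of each formula g.
        total = sum(a * sum(mu[i][w][v] for v in W if sat(W, R, mu, V, v, g))
                    for a, g in terms)
        return total >= b
    raise ValueError(f"unknown operator: {op}")

def valid_in_model(W, R, mu, V, f):
    # Validity in a model: truth at every state.
    return all(sat(W, R, mu, V, w, f) for w in W)
```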

As promised, we will now provide correspondence results for the frame properties defined at the end of the previous subsection:

Lemma 13.1

Consider a probabilistic Kripke frame \(\mathbb {F}\). Then:

  1. \(\mathbb {F}\) is \(i\)-consistent iff \(\mathbb {F}\models K_ip\rightarrow P_i(p)=1\),

  2. \(\mathbb {F}\) is \(i\)-uniform iff \(\mathbb {F}\models (\varphi \rightarrow K_i\varphi ) \wedge (\lnot \varphi \rightarrow K_i\lnot \varphi )\) for all \(i\)-probability formulas \(\varphi \).

From a technical perspective, this lemma indicates how the notion of frame correspondence from modal logic [8, 9, 20] can be extended into the probabilistic realm. From an intuitive perspective, this lemma sheds some new light on the various interactions between epistemic and probabilistic information. Probabilistic epistemic logic distinguishes between epistemic impossibility (\((w,v)\notin R_i\)) and probabilistic impossibility (\(\mu _i(w)(v)=0\)). For example, when a fair coin is tossed, an infinite series of tails is probabilistically impossible, but epistemically possible [37, p. 384]. Item 1 of Lemma 13.1 establishes a connection between the principle that knowledge implies certainty, and the property of consistency (epistemic impossibility entails probabilistic impossibility). Similarly, item 2 establishes a connection between the principle that agents know their own probabilistic setup, and the property of uniformity (the impossibility of epistemic uncertainty about probabilities).

Fig. 13.1 Componentwise axiomatization of probabilistic epistemic logic

2.3 Proof System

Probabilistic epistemic logic can be axiomatized in a highly modular fashion. An overview is given in Fig. 13.1. The propositional and epistemic components require no further comment. The probabilistic component is a straightforward translation into the formal language \(\fancyscript{L}\) of the well-known Kolmogorov axioms of probability; it ensures that the formal symbol \(P_i(\,\cdot \,)\) behaves like a real probability function. Finally, the linear inequalities component is mainly a technical tool to ensure that the logic is strong enough to capture the behavior of linear inequalities of probabilities.

Using standard techniques the following theorem can be proved [28]:

Theorem 13.1

Probabilistic epistemic logic, as axiomatized in Fig. 13.1, is sound and complete with respect to the class of probabilistic Kripke frames.

The notion of completeness used in this theorem is weak completeness (\(\vdash \varphi \) iff \(\models \varphi \)), rather than strong completeness (\(\varGamma \vdash \varphi \) iff \(\varGamma \models \varphi \)). These two notions do not coincide in probabilistic epistemic logic, because this logic is not compact; for example, every finite subset of the set \(\{P_i(p) >0\} \cup \{P_i(p) \le k \,|\, k \in \mathbb {Q}, k > 0\}\) is satisfiable, but the entire set is not.

3 Probabilistic Public Announcement Logic

In this section we discuss a first ‘dynamification’ of probabilistic epistemic logic, by introducing public announcements into the logic. Section 13.3.1 discusses updated probabilistic Kripke models, and introduces a public announcement operator into the formal language to talk about these models. Section 13.3.2 provides a complete axiomatization, and Sect. 13.3.3 focuses on the role of higher-order information in public announcement dynamics.

3.1 Semantics

Public announcements form one of the simplest types of epistemic dynamics. They concern the truthful and public announcement of some piece of information \(\varphi \) by an external source. That the announcement is truthful means that the announced information \(\varphi \) has to be true; that it is public means that all agents \(i\in I\) learn about it simultaneously and commonly. Finally, the announcement’s source is called ‘external’ because it is not one of the agents \(i\in I\) (and will thus not be explicitly represented in the formal language).

Public announcement logic [27, 31, 44] represents these announcements as updates that change Kripke models, and introduces a dynamic public announcement operator into the formal language to describe these updated models. This strategy can straightforwardly be extended into the probabilistic realm.

Syntactically, we add a dynamic operator \([!\,\cdot ]\,\cdot \) to the static language \(\fancyscript{L}\), thus obtaining the new language \(\fancyscript{L}^!\). The formula \([!\varphi ]\psi \) means that after any truthful public announcement of \(\varphi \), it will be the case that \(\psi \). Its dual is defined as \(\langle !\varphi \rangle \psi := \lnot [!\varphi ]\lnot \psi \), and means that \(\varphi \) can truthfully and publicly be announced, and afterwards \(\psi \) will be the case. These formulas thus allow us to express ‘now’ (i.e. before any dynamics has taken place) what will be the case ‘later’ (after the dynamics has taken place). These formulas are interpreted on a probabilistic Kripke model \(\mathbb {M}\) and state \(w\) as follows:

\(\mathbb {M},w\models [!\varphi ]\psi \)    iff    if \(\mathbb {M},w\models \varphi \) then \(\mathbb {M}|\varphi ,w\models \psi \),

\(\mathbb {M},w\models \langle !\varphi \rangle \psi \)    iff    \(\mathbb {M},w\models \varphi \) and \(\mathbb {M}|\varphi ,w\models \psi \).

Note that these clauses not only use the model \(\mathbb {M}\), but also the updated model \(\mathbb {M}|\varphi \). The model \(\mathbb {M}\) represents the agents’ information before the public announcement of \(\varphi \); the model \(\mathbb {M}|\varphi \) represents their information after the public announcement of \(\varphi \); hence the public announcement of \(\varphi \) itself is represented by the update mechanism \(\mathbb {M}\mapsto \mathbb {M}|\varphi \), which is formally defined as follows:

Definition 13.4

Consider a probabilistic Kripke model \(\mathbb {M} = \langle W,R_i,\mu _i,V\rangle _{i\in I}\), a state \(w\in W\), and a formula \(\varphi \in \fancyscript{L}^!\) such that \(\mathbb {M},w\models \varphi \). Then the updated probabilistic Kripke model \(\mathbb {M}|\varphi := \langle W^\varphi ,R_i^\varphi ,\mu _i^\varphi ,V^\varphi \rangle _{i\in I}\) is defined as follows:

  • \(W^\varphi := W\),

  • \(R_i^\varphi := R_i \cap (W \times {{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})\) (for every agent \(i\in I\)),

  • \(\mu _i^\varphi :W^\varphi \rightarrow (W^\varphi \rightarrow [0,1])\) is defined (for every agent \(i\in I\)) by

    $$\mu _i^\varphi (v)(u) := {\left\{ \begin{array}{ll} \frac{\mu _i(v)(\{u\}\cap {{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})}{\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})} &{}\text {if } \mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})>0 \\ \mu _i(v)(u) &{}\text {if } \mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})=0, \end{array}\right. }$$
  • \(V^\varphi :=V\).

The main effect of the public announcement of \(\varphi \) in a model \(\mathbb {M}\) is that all links to \(\lnot \varphi \)-states are deleted; hence these states are no longer accessible for any of the agents. This procedure is standard; we will therefore focus on the probabilistic components \(\mu _i^\varphi \).

First of all, it should be noted that the case distinction in the definition of \(\mu ^\varphi _i(v)(u)\) is made for strictly technical reasons, viz. to ensure that there are no ‘dangerous’ divisions by 0. In all examples and applications, we will be using the ‘interesting’ case \(\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})>0\). Still, for general theoretical reasons, something has to be said about the case \(\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})=0\). Leaving \(\mu _i^\varphi (v)(u)\) undefined would lead to truth value gaps in the logic, and thus greatly increase the difficulty of finding a complete axiomatization. The approach taken in this chapter is to define \(\mu _i^\varphi (v)(u)\) simply as \(\mu _i(v)(u)\) in case \(\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})=0\)—so the public announcement of \(\varphi \) has no effect whatsoever on \(\mu _i(v)\). The intuitive idea behind this definition is that an agent \(i\) simply ignores new information if she previously assigned probability 0 to it. Technically speaking, this definition will yield a relatively simple axiomatization.

One can easily check that if \(\mathbb {M}\) is a probabilistic Kripke model, then \(\mathbb {M}|\varphi \) is a probabilistic Kripke model as well. We focus on \(\mu _i^\varphi (v)\) (for an arbitrary state \(v \in W^\varphi \)). If \(\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})=0\), then \(\mu _i^\varphi (v)\) is \(\mu _i(v)\), which is a probability function on \(W = W^\varphi \). If \(\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})>0\), then for any \(u\in W^\varphi \),

$$\begin{aligned} \mu _i^\varphi (v)(u) = \frac{\mu _i(v)(\{u\}\cap {{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})}{\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})}, \end{aligned}$$

which is non-negative because \(\mu _i(v)(\{u\}\cap {{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})\) is non-negative, and at most 1 because \(\mu _i(v)(\{u\}\cap {{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})\le \mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})\); hence \(\mu _i^\varphi (v)(u)\in [0,1]\). Furthermore,

$$\begin{aligned} \sum _{u\in W^\varphi }\mu _i^\varphi (v)(u)=\sum _{u\in W}\frac{\mu _i(v)(\{u\}\cap {{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})}{\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})}=\sum _{\mathbb {M},u\models \varphi }\frac{\mu _i(v)(u)}{\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})} = 1. \end{aligned}$$

It should be noted that the definition of \(\mu _i^\varphi (v)\)—in the interesting case when \(\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})>0\)—can also be expressed in terms of conditional probabilities:

$$\begin{aligned} \mu _i^\varphi (v)(u) = \frac{\mu _i(v)(\{u\}\cap {{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})}{\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})} = \mu _i(v)(u\,|\,{{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M}). \end{aligned}$$

In general, for any \(X\subseteq W^\varphi \) we have:

$$\begin{aligned} \mu _i^\varphi (v)(X) = \frac{\mu _i(v)(X\cap {{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})}{\mu _i(v)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})} = \mu _i(v)(X\,|\,{{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M}). \end{aligned}$$

In other words, after the public announcement of a formula \(\varphi \), the agents calculate their new, updated probabilities by means of Bayesian conditionalization on the information provided by the announced formula \(\varphi \). This connection between public announcements and Bayesian conditionalization will be explored more thoroughly in Sect. 13.3.3.
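Computationally, Definition 13.4 is a simple transformation of the model's components. The following sketch (names ours) takes the truth set \({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M}\) as input, e.g. as computed by a model checker like the one sketched in Sect. 13.2.2.

```python
# A sketch of the public announcement update M -> M|phi (Definition 13.4).
# truth_set is [[phi]]^M, assumed to be computed beforehand.

def announce(W, R, mu, V, truth_set):
    # Delete all links to states where phi fails.
    R_new = {i: {(w, v) for (w, v) in R[i] if v in truth_set} for i in R}
    mu_new = {}
    for i in mu:
        mu_new[i] = {}
        for v in W:
            p_phi = sum(mu[i][v][u] for u in truth_set)
            if p_phi > 0:
                # The 'interesting' case: Bayesian conditionalization on [[phi]]^M.
                mu_new[i][v] = {u: (mu[i][v][u] / p_phi if u in truth_set else 0.0)
                                for u in W}
            else:
                # Probability-0 announcements leave mu_i(v) untouched.
                mu_new[i][v] = dict(mu[i][v])
    return W, R_new, mu_new, V   # W and V are unchanged
```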

Example 13.1

We finish this subsection by discussing a simple example. Consider the following scenario. An agent does not know whether \(p\) is the case, i.e. she cannot distinguish between \(p\)-states and \(\lnot p\)-states. (In fact, \(p\) happens to be true.) Furthermore, the agent has no specific reason to think that one state is more probable than any other; therefore it is reasonable for her to assign equal probabilities to all states. This example can be formalized by the following model: \(\mathbb {M} = \langle W,R,\mu ,V\rangle , W = \{w,v\}, R = W\times W, \mu (w)(w) = \mu (w)(v) = \mu (v)(w) = \mu (v)(v) = 0.5\), and \(V(p)=\{w\}\). (We work with only one agent in this example, so agent indices can be dropped.) This model is a faithful representation of the scenario described above; for example:

$$\begin{aligned} \mathbb {M},w\models \lnot Kp\wedge \lnot K\lnot p\wedge P(p)=0.5\wedge P(\lnot p)=0.5. \end{aligned}$$

Now suppose that \(p\) is publicly announced (this is indeed possible, since \(p\) was assumed to be actually true). Applying Definition 13.4 we obtain the updated model \(\mathbb {M}|p\), with \(W^p=W\), \(R^p = \{(w,w),(v,w)\}\), and

$$\begin{aligned} \mu ^p(w)({{\mathrm{[\![\!}}}p{{\mathrm{\!]\!]}}}^{\mathbb {M}|p}) = \mu ^p(w)(w) = \frac{\mu (w)(\{w\}\cap {{\mathrm{[\![\!}}}p{{\mathrm{\!]\!]}}}^\mathbb {M})}{\mu (w)({{\mathrm{[\![\!}}}p{{\mathrm{\!]\!]}}}^\mathbb {M})} = \frac{\mu (w)(w)}{\mu (w)(w)} = 1. \end{aligned}$$

Using this updated model \(\mathbb {M}|p\), we find that

$$\begin{aligned} \mathbb {M},w\models [!p]\big (Kp \wedge P(p)=1\wedge P(\lnot p)=0\big ). \end{aligned}$$

So after the public announcement of \(p\), the agent has come to know that \(p\) is in fact the case. She has also adjusted her probabilities: she now assigns probability 1 to \(p\) being true, and probability 0 to \(p\) being false. These are the results that one would intuitively expect, so Definition 13.4 seems to yield an adequate representation of the epistemic and probabilistic effects of public announcements.
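As a sanity check, Example 13.1 can be run through the announce sketch above (the agent name "a" is an arbitrary label of ours):

```python
W = {"w", "v"}
R = {"a": {(x, y) for x in W for y in W}}
mu = {"a": {x: {"w": 0.5, "v": 0.5} for x in W}}
V = {"p": {"w"}}

W2, R2, mu2, V2 = announce(W, R, mu, V, truth_set=V["p"])   # [[p]]^M = {w}
assert R2["a"] == {("w", "w"), ("v", "w")}    # all links to the not-p state are cut
assert mu2["a"]["w"] == {"w": 1.0, "v": 0.0}  # P(p) = 1 and P(not p) = 0 at w
```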

3.2 Proof System

Public announcement logic can be axiomatized by adding a set of reduction axioms to the static base logic [27]. These axioms allow us to recursively rewrite formulas containing dynamic public announcement operators as formulas without such operators; hence the dynamic language \(\fancyscript{L}^!\) is equally expressive as the static \(\fancyscript{L}\). Alternatively, reduction axioms can be seen as ‘predicting’ what will be the case after the public announcement has taken place in terms of what is the case before the public announcement has taken place.

Fig. 13.2 Axiomatization of probabilistic public announcement logic

This strategy can be extended into the probabilistic realm. For the static base logic, we do not simply take some system of epistemic logic (usually S5), but rather the system of probabilistic epistemic logic described in Sect. 13.2.3 (Fig. 13.1), and add the reduction axioms shown in Fig. 13.2. The first four reduction axioms are familiar from classical (non-probabilistic) public announcement logic. Note that the reduction axiom for \(i\)-probability formulas makes, just like Definition 13.4, a case distinction based on whether the agent assigns probability 0 to the announced formula \(\varphi \). The significance of this reduction axiom, and its connection with Bayesian conditionalization, will be further explored in the next subsection.

Once again, standard techniques suffice to prove the following theorem [37]:

Theorem 13.2

Probabilistic public announcement logic, as axiomatized in Fig. 13.2, is sound and complete with respect to the class of probabilistic Kripke frames.

3.3 Higher-Order Information in Public Announcements

In this subsection we will discuss the role of higher-order information in probabilistic public announcement logic. This will further clarify the connection, but also the distinction, between (dynamic versions of) probabilistic epistemic logic and probability theory proper.

In the previous subsection we introduced a reduction axiom for \(i\)-probability formulas. This axiom allows us to derive the following principle as a special case:

$$\begin{aligned} (\varphi \wedge P_i(\varphi )>0)\,\longrightarrow \,\big ([!\varphi ]P_i(\psi )\ge b \leftrightarrow P_i(\langle !\varphi \rangle \psi )\ge bP_i(\varphi )\big ).\end{aligned}$$
(13.1)

The antecedent states that \(\varphi \) is true (because of the truthfulness of public announcements) and that agent \(i\) assigns it a strictly positive probability (so that we are in the ‘interesting’ case of the reduction axiom). To see the meaning of the consequent more clearly, note that \(\vdash \langle !\varphi \rangle \psi \leftrightarrow (\varphi \wedge [!\varphi ]\psi )\), and introduce the following abbreviation of conditional probability into the formal language:

$$\begin{aligned} P_i(\beta \,|\,\alpha )\ge b \,\,\,\,:=\,\,\,\, P_i(\alpha \wedge \beta )\ge bP_i(\alpha ). \end{aligned}$$

Principle (13.1) can now be rewritten as follows:

$$\begin{aligned} (\varphi \wedge P_i(\varphi )>0)\,\longrightarrow \,\big ([!\varphi ]P_i(\psi )\ge b \leftrightarrow P_i([!\varphi ]\psi \,|\,\varphi )\ge b\big ). \end{aligned}$$
(13.2)

A similar version can be proved for \(\le \) instead of \(\ge \); combining these two we get:

$$\begin{aligned} (\varphi \wedge P_i(\varphi )>0)\,\longrightarrow \,\big ([!\varphi ]P_i(\psi )= b \leftrightarrow P_i([!\varphi ]\psi \,|\,\varphi )= b\big ). \end{aligned}$$
(13.3)

The consequent thus states a connection between the agent's probability of \(\psi \) after the public announcement of \(\varphi \), and her conditional probability of \([!\varphi ]\psi \), given the truth of \(\varphi \). In other words, after a public announcement of \(\varphi \), the agent updates her probabilities by Bayesian conditionalization on \(\varphi \). The subtlety of principle (13.3), however, is that the agent does not take the conditional probability (conditional on \(\varphi \)) of \(\psi \) itself, but rather of the updated formula \([!\varphi ]\psi \).

The reason for this is that \([!\varphi ]P_i(\psi )=b\) talks about the probability that the agent assigns to \(\psi \) after the public announcement of \(\varphi \) has actually happened. If we want to describe this probability as a conditional probability, we cannot simply make use of the conditional probability \(P_i(\psi \,|\,\varphi )\), because this represents the probability that the agent would assign to \(\psi \) if a public announcement of \(\varphi \) would happen—hypothetically, not actually! Borrowing a slogan from van Benthem: “The former takes place once arrived at one’s vacation destination, the latter is like reading a travel folder and musing about tropical islands.” [11, p. 417]. Hence, if we want to describe the agent’s probability of \(\psi \) after an actual public announcement of \(\varphi \) in terms of conditional probabilities, we need to represent the effects of the public announcement of \(\varphi \) on \(\psi \) explicitly, and thus take the conditional probability (conditional on \(\varphi \)) of \([!\varphi ]\psi \), rather than \(\psi \).

One might wonder about the relevance of this subtle distinction between actual and hypothetical public announcements. The point is that the public announcement of \(\varphi \) can have effects on the truth value of \(\psi \). For large classes of formulas \(\psi \), this will not occur: their truth value is not affected by the public announcement of \(\varphi \). Formally, this means that \(\vdash \psi \leftrightarrow [!\varphi ]\psi \), and thus (the consequent of) principle (13.3) becomes:

$$\begin{aligned}{}[!\varphi ]P_i(\psi ) = b \leftrightarrow P_i(\psi \,|\,\varphi )=b \end{aligned}$$

—thus wiping away all differences between the agent’s probability of \(\psi \) after a public announcement of \(\varphi \), and her conditional probability of \(\psi \), given \(\varphi \). A typical class of such formulas (whose truth value is unaffected by the public announcement of \(\varphi \)) is formed by the Boolean combinations of proposition letters, i.e. those formulas which express ontic or first-order information. Since probability theory proper is usually only concerned with first-order information (‘no nested probabilities’), the distinction between actual and hypothetical announcements—or in general, between actual and hypothetical learning of new information—thus vanishes completely, and Bayesian conditionalization can be used as a universal update rule to compute new probabilities after (actually) learning a new piece of information.

However, in probabilistic epistemic logic (and its dynamic versions, such as probabilistic PAL), higher-order information is taken into account, and hence the distinction between actual and hypothetical public announcements has to be taken seriously. Therefore, the consequent of principle (13.3) should really use the conditional probability \(P_i([!\varphi ]\psi \,|\,\varphi )\), rather than just \(P_i(\psi \,|\,\varphi )\).Footnote 4

Example 13.2

To illustrate this, consider again the model defined in Example 13.1, and put \(\varphi := p \wedge P(\lnot p) = 0.5\). It is easy to show that

$$\begin{aligned} \mathbb {M},w\models P(\varphi \,|\,\varphi ) = 1 \,\wedge \, P([!\varphi ]\varphi \,|\,\varphi ) = 0 \,\wedge \, [!\varphi ]P(\varphi )=0. \end{aligned}$$

Hence the probability assigned to \(\varphi \) after the public announcement is the conditional probability \(P([!\varphi ]\varphi \,|\,\varphi )\), rather than just \(P(\varphi \,|\,\varphi )\). (Indeed, announcing \(\varphi \) makes the agent certain of \(p\), so the conjunct \(P(\lnot p)=0.5\) becomes false: \(\varphi \) falsifies itself upon being announced, and hence \(\langle !\varphi \rangle \varphi \) holds nowhere in \(\mathbb {M}\).) Note that this example indeed involves higher-order information, since we are talking about the probability of \(\varphi \), which itself contains the probability statement \(P(\lnot p)=0.5\) as a conjunct. Finally, this example also shows that learning a new piece of information \(\varphi \) (via public announcement) does not automatically lead to the agents being certain about (i.e. assigning probability 1 to) that formula. This is to be contrasted with probability theory, where a new piece of information \(\varphi \) is processed via Bayesian conditionalization, and thus always leads to certainty: \(P(\varphi \,|\,\varphi ) = 1\). The explanation is, once again, that probability theory is only concerned with first-order information, whereas the phenomena described above can only occur at the level of higher-order information.Footnote 5, Footnote 6

4 Probabilistic Dynamic Epistemic Logic

In this section we will move from a probabilistic version of public announcement logic to a probabilistic version of ‘full’ dynamic epistemic logic. Section 13.4.1 introduces a probabilistic version of the product update mechanism that is behind dynamic epistemic logic. Section 13.4.2 introduces dynamic operators into the formal language to talk about these product updates, and discusses a detailed example. Section 13.4.3, finally, shows how to obtain a complete axiomatization in a fully standard (though non-trivial) fashion.

4.1 Probabilistic Product Update

Classical (non-probabilistic) dynamic epistemic logic models epistemic dynamics by means of a product update mechanism [4, 5]. The agents’ static information (what is the current state?) is represented in a Kripke model \(\mathbb {M}\), and their dynamic information (what type of event is currently taking place?) is represented in an update model \(\mathbb {E}\). The agents’ new information (after the dynamics has taken place) is represented by means of a product construction \(\mathbb {M}\otimes \mathbb {E}\). We will now show how to define a probabilistic version of this construction.

Before stating the formal definitions, we show how they naturally arise as probabilistic generalizations of the classical (non-probabilistic) notions. The probabilistic Kripke models introduced in Definition 13.2 represent the agents’ static information, in both its epistemic and its probabilistic aspects. This static probabilistic information is called the prior probabilities of the states in [17]. We can thus say that when \(w\) is the actual state, agent \(i\) considers it epistemically possible that \(v\) is the actual state (\((w,v)\in R_i\)), and, more specifically, that she assigns probability \(b\) to \(v\) being the actual state (\(\mu _i(w)(v)=b)\).

Update models are essentially like Kripke models: they represent the agents’ information about events, rather than states. Since probabilistic Kripke models represent both epistemic and probabilistic information about states, by analogy probabilistic update models should represent both epistemic and probabilistic information about events. Hence, they should not only have epistemic accessibility relations \(R_i\) over their set of events \(E\), but also probability functions \(\mu _i:E\rightarrow (E\rightarrow [0,1])\). (Formal details will be given in Definition 13.5.) We can then say that when \(e\) is the actually occurring event, agent \(i\) considers it epistemically possible that \(f\) is the actually occurring event (\((e,f)\in R_i\)), and, more specifically, that she assigns probability \(b\) to \(f\) being the actually occurring event (\(\mu _i(e)(f)=b\)). This dynamic probabilistic information is called the observation probabilities in van Benthem et al. [17].

Finally, how probable it is that an event \(e\) will occur might vary from state to state. We assume that this variation can be captured by means of a set \(\varPhi \) of (pairwise inconsistent) sentences in the object language (so that the probability that an event \(e\) will occur can only vary between states that satisfy different sentences of \(\varPhi \)). This will be formalized by adding to the probabilistic update models a set of preconditions \(\varPhi \), and probability functions \(\mathsf {pre}:\varPhi \rightarrow (E\rightarrow [0,1])\). The meaning of \(\mathsf {pre}(\varphi )(e) = b\) is that if \(\varphi \) holds, then event \(e\) occurs with probability \(b\). In van Benthem et al. [17] these are called occurrence probabilities.Footnote 7

We are now ready to formally introduce probabilistic update models:

Definition 13.5

A probabilistic update model is a tuple \(\mathbb {E} = \langle E,R_i,\varPhi ,\mathsf {pre},\mu _i\rangle _{i\in I}\), where \(E\) is a non-empty finite set of events, \(R_i\subseteq E\times E\) is agent \(i\)’s epistemic accessibility relation, \(\varPhi \subseteq \fancyscript{L^\otimes }\) is a finite set of pairwise inconsistent sentences called preconditions, \(\mu _i:E\rightarrow (E\rightarrow [0,1])\) assigns to each event \(e\in E\) a probability function \(\mu _i(e)\) over \(E\), and \(\mathsf {pre}:\varPhi \rightarrow (E\rightarrow [0,1])\) assigns to each precondition \(\varphi \in \varPhi \) a probability function \(\mathsf {pre}(\varphi )\) over \(E\).

All components of a probabilistic update model have already been commented upon. Note that we use the same symbols \(R_i\) and \(\mu _i\) to indicate agent \(i\)’s epistemic and probabilistic information in a probabilistic Kripke model \(\mathbb {M}\) and in a probabilistic update model \(\mathbb {E}\)—from the context it will always be clear which of the two is meant. The language \(\fancyscript{L}^\otimes \) that the preconditions are taken from will be formally defined in the next subsection. (As is usual in this area, there is a non-vicious simultaneous recursion going on here.)
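As plain data, such an update model might be encoded as follows; the sketch uses the numbers of Example 13.3 below, and all keys and labels are ours.

```python
# A probabilistic update model (Definition 13.5) as plain data, using the
# Picasso scenario of Example 13.3 below; preconditions are kept as labels.

update_model = {
    "E": {"e", "f"},                              # events
    "R": {"a": {(x, y) for x in ("e", "f") for y in ("e", "f")}},
    "Phi": ["real", "not_real"],                  # pairwise inconsistent preconditions
    "pre": {"real":     {"e": 0.97, "f": 0.03},   # pre(phi): occurrence probabilities
            "not_real": {"e": 0.0,  "f": 1.0}},
    "mu": {"a": {"e": {"e": 0.5, "f": 0.5},       # mu_i(e): observation probabilities
                 "f": {"e": 0.5, "f": 0.5}}},
}
```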

We now introduce occurrence probabilities for events at states:

Definition 13.6

Consider a probabilistic Kripke model \(\mathbb {M}\), a state \(w\), a probabilistic update model \(\mathbb {E}\), and an event \(e\). Then the occurrence probability of e at w is defined as

$$\mathsf {pre}(w)(e) = {\left\{ \begin{array}{ll} \,\,\mathsf {pre}(\varphi )(e) &{} \text {if } \varphi \in \varPhi \text { and } \mathbb {M},w\models \varphi \\ \,\,0 &{} \text {if there is no } \varphi \in \varPhi \text { such that } \mathbb {M},w\models \varphi . \end{array}\right. } $$

Since the preconditions are pairwise inconsistent, \(\mathsf {pre}(w)(e)\) is always well-defined. The meaning of \(\mathsf {pre}(w)(e) = b\) is that in state \(w\), event \(e\) occurs with probability \(b\). Note that if two states \(w\) and \(v\) satisfy the same precondition, then always \(\mathsf {pre}(w)(e)= \mathsf {pre}(v)(e)\); in other words, the occurrence probabilities of an event \(e\) can only vary ‘up to a precondition’ (cf. supra).
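Computationally, Definition 13.6 amounts to a simple lookup, as in the following sketch (names ours); sat stands for any procedure that decides the truth of a precondition at a state, such as the model checker sketched in Sect. 13.2.2.

```python
# A sketch of Definition 13.6: the occurrence probability of event e at state w.

def occurrence_prob(w, e, preconditions, pre, sat):
    for phi in preconditions:
        if sat(w, phi):            # at most one precondition can hold at w,
            return pre[phi][e]     # since they are pairwise inconsistent
    return 0.0                     # no precondition holds at w
```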

The probabilistic product update mechanism can now be defined as follows:

Definition 13.7

Consider a probabilistic Kripke model \(\mathbb {M} = \langle W,R_i,\mu _i,V\rangle _{i\in I}\) and a probabilistic update model \(\mathbb {E} = \langle E,R_i,\varPhi ,\mathsf {pre}, \mu _i\rangle _{i\in I}\). Then the updated model \(\mathbb {M}\otimes \mathbb {E} := \langle W',R_i',\mu _i',V'\rangle _{i\in I}\) is defined as follows:

  • \(W' := \{(w,e)\,|\,w\in W, e\in E, \mathsf {pre}(w)(e)>0\}\),

  • \(R'_i := \{((w,e), (w',e')) \in W'\times W' \,|\, (w,w') \in R_i \text { and } (e,e') \in R_i\}\)

    (for every agent \(i\in I\)),

  • \(\mu _i':W' \rightarrow (W' \rightarrow [0,1])\) is defined (for every agent \(i\in I\)) by

    $$\begin{aligned} \mu _i'(w,e)(w',e') := \frac{\mu _i(w)(w')\cdot \mathsf {pre}(w')(e')\cdot \mu _i(e)(e')}{\sum _{\begin{array}{c} w''\in W\\ e''\in E \end{array}} \mu _i(w)(w'')\cdot \mathsf {pre}(w'')(e'')\cdot \mu _i(e)(e'')} \end{aligned}$$

    if the denominator is strictly positive, and \(\mu _i'(w,e)(w',e'):=0\) otherwise,

  • \(V'(p) := \{(w,e)\in W' \,|\, w\in V(p)\}\) (for every \(p\in Prop\)).

We will only comment on the probabilistic component of this definition (all other components are fully classical). After the dynamics has taken place, agent \(i\) calculates at state \((w,e)\) her new probability for \((w',e')\) by taking the arithmetical product of (i) her prior probability for \(w'\) at \(w\), (ii) the occurrence probability of \(e'\) in \(w'\), and (iii) her observation probability for \(e'\) at \(e\), and then normalizing this product. The factors in this product are not weighted (or equivalently, they all have weight 1); van Benthem et al. [17] also discuss weighted versions of this update mechanism, and show how one of these weighted versions corresponds to the rule of Jeffrey conditioning from probability theory [36]. Finally, note that \(\mathbb {M}\otimes \mathbb {E}\) might fail to be a probabilistic Kripke model: if the denominator in the definition of \(\mu _i'(w,e)\) is 0, then \(\mu _i'(w,e)\) assigns 0 to all states in \(W'\). We will not dwell on the interpretation of this feature here, but only remark that it is technically harmless and, perhaps most importantly, still allows for a reduction axiom for \(i\)-probability formulas (cf. Sect. 13.4.3).
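The following sketch (names ours) implements the product update; the parameter pre_at supplies the occurrence probabilities of Definition 13.6, and obs the observation probabilities \(\mu _i(e)\).

```python
# A sketch of the probabilistic product update M x E (Definition 13.7).

def product_update(W, R, mu, V, E, RE, obs, pre_at):
    W2 = {(w, e) for w in W for e in E if pre_at(w, e) > 0}
    R2 = {i: {(s, t) for s in W2 for t in W2
              if (s[0], t[0]) in R[i] and (s[1], t[1]) in RE[i]}
          for i in R}
    mu2 = {i: {} for i in mu}
    for i in mu:
        for (w, e) in W2:
            def weight(w2, e2):
                # prior probability * occurrence probability * observation probability
                return mu[i][w][w2] * pre_at(w2, e2) * obs[i][e][e2]
            denom = sum(weight(w2, e2) for w2 in W for e2 in E)
            mu2[i][(w, e)] = {s: (weight(*s) / denom if denom > 0 else 0.0)
                              for s in W2}
    V2 = {p: {s for s in W2 if s[0] in V[p]} for p in V}
    return W2, R2, mu2, V2
```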

4.2 Language and Semantics

To talk about these updated models, we add dynamic operators \([\mathsf {E,e}]\) to the static language \(\fancyscript{L}\), thus obtaining the new language \(\fancyscript{L}^\otimes \). Here, \(\mathsf {E,e}\) are formal names for the probabilistic update model \(\mathbb {E} = \langle E,R_i,\varPhi ,\mathsf {pre},\mu _i\rangle _{i\in I}\) and event \(e\in E\) (recall our remark about the mutual recursion of the dynamic language and the updated models). The formula \([\mathsf {E,e}]\varphi \) means that after the event \(e\) has occurred, it will be the case that \(\varphi \). It has the following semantics:

\(\mathbb {M},w\models [\mathsf {E,e}]\psi \)    iff    if \(\mathsf {pre}(w)(e) > 0\), then \(\mathbb {M}\otimes \mathbb {E},(w,e)\models \psi .\)

Example 13.3

Consider the following scenario. While strolling through a flea market, you see a painting that you think might be a real Picasso. Of course, the chance that the painting is actually a real Picasso is very slim, say 1 in 100,000. You know from an art encyclopedia that Picasso signed almost all his paintings with a very characteristic signature. If the painting is a real Picasso, the chance that it bears the characteristic signature is 97 %, while if the painting is not a real Picasso, the chance that it bears the characteristic signature is 0 % (nobody is capable of imitating Picasso’s signature). You immediately look at the painting’s signature, but determining whether it is Picasso’s characteristic signature is very hard, and—not being an expert art historian—you remain uncertain and think that the chance is 50 % that the painting’s signature is Picasso’s characteristic one.

Your initial information (before having looked at the painting’s signature) can be represented as the following probabilistic Kripke model: \(\mathbb {M} = \langle W,R,\mu ,V\rangle \), where \(W = \{w,v\}, R = W\times W, \mu (w)(w) = \mu (v)(w) = 0.00001,\mu (w)(v) = \mu (v)(v) = 0.99999\), and \(V(\mathsf {real})=\{w\}\). (We work with only one agent in this example, so agent indices can be dropped.) Hence, initially you do not rule out the possibility that the painting in front of you is a real Picasso, but you consider it highly unlikely:

$$\begin{aligned} \mathbb {M},w\models \hat{K}\mathsf {real} \wedge P(\mathsf {real}) = 0.00001. \end{aligned}$$

The event of looking at the signature can be represented with the following update model: \(\mathbb {E} = \langle E,R,\varPhi ,\mathsf {pre},\mu \rangle \), where \(E = \{e,f\}\), \(R = E\times E\), \(\varPhi = \{\mathsf {real},\lnot \mathsf {real}\}\), \(\mathsf {pre}(\mathsf {real})(e) = 0.97\), \(\mathsf {pre}(\mathsf {real})(f) = 0.03\), \(\mathsf {pre}(\lnot \mathsf {real})(e) = 0\), \(\mathsf {pre}(\lnot \mathsf {real})(f) = 1\), and \(\mu (e)(e) = \mu (f)(e) = \mu (e)(f) = \mu (f)(f)=0.5\). The event \(e\) represents ‘looking at Picasso’s characteristic signature’; the event \(f\) represents ‘looking at a signature that is not Picasso’s characteristic one’.

We now construct the updated model \(\mathbb {M}\otimes \mathbb {E}\). Since \(\mathbb {M},v\not \models \mathsf {real}\), it holds that \(\mathsf {pre}(v)(e)=\mathsf {pre}(\lnot \mathsf {real})(e) = 0\), and hence \((v,e)\) does not belong to the updated model. It is easy to see that the other states \((w,e),(w,f)\) and \((v,f)\) do belong to the updated model. Furthermore, one can easily calculate that \(\mu '(w,e)(w,e) = 0.0000097\) and \(\mu '(w,e)(w,f) = 0.0000003\), so \(\mu '(w,e)({{\mathrm{[\![\!}}}\mathsf {real}{{\mathrm{\!]\!]}}}^{\mathbb {M}\otimes \mathbb {E}}) = 0.0000097 + 0.0000003 = 0.00001\), and thus

$$\begin{aligned} \mathbb {M},w\models [\mathsf {E,e}]P(\mathsf {real}) = 0.00001. \end{aligned}$$

Hence, even though the painting in front of you is a real Picasso (in state \(w\)), after looking at the signature (which is indeed Picasso’s characteristic signature!—the event that actually happened was event \(e\)) you still assign a probability of 1 in 100,000 to it being a real Picasso.

Note that if you had been an expert art historian, with the same prior probabilities, but with the reliable capability of recognizing Picasso’s characteristic signature—let’s formalize this as \(\mu (e)(e) = 0.99\) and \(\mu (e)(f)=0.01\)—, then the same update mechanism would have implied that

$$\begin{aligned} \mathbb {M},w\models [\mathsf {E,e}]P(\mathsf {real})=0.00096. \end{aligned}$$

In other words, if you had been an expert art historian, then looking at the painting’s signature would have been highly informative: it would have led to a significant change in your probabilities.
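Running the scenario through the product_update sketch above reproduces these numbers; all identifiers are ours.

```python
W = {"w", "v"}; E = {"e", "f"}
R = {"a": {(x, y) for x in W for y in W}}
RE = {"a": {(x, y) for x in E for y in E}}
mu = {"a": {x: {"w": 0.00001, "v": 0.99999} for x in W}}
V = {"real": {"w"}}
obs = {"a": {x: {"e": 0.5, "f": 0.5} for x in E}}   # the layman's observation

def pre_at(w, e):
    # 'real' holds exactly at w; 'not real' exactly at v (Definition 13.6).
    return ({"e": 0.97, "f": 0.03} if w == "w" else {"e": 0.0, "f": 1.0})[e]

W2, R2, mu2, V2 = product_update(W, R, mu, V, E, RE, obs, pre_at)
posterior = sum(p for (s, _), p in mu2["a"][("w", "e")].items() if s == "w")
print(round(posterior, 10))   # 1e-05: the probability of 'real' is unchanged

# With the expert's observation probabilities obs["a"]["e"] = {"e": 0.99, "f": 0.01},
# the same computation yields approximately 0.00096.
```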

4.3 Proof System

A complete axiomatization for probabilistic dynamic epistemic logic can be found using the standard strategy, viz. by adding a set of reduction axioms to static probabilistic epistemic logic. Implementing this strategy, however, is not entirely trivial. The reduction axioms for non-probabilistic formulas are familiar from classical (non-probabilistic) dynamic epistemic logic, but the reduction axiom for \(i\)-probability formulas is more complicated.

First of all, this reduction axiom makes a case distinction on whether a certain sum of probabilities is strictly positive or not. We will show that this corresponds to the case distinction made in the definition of the updated probability functions (Definition 13.7). In the definition of \(\mu _i'(w,e)\), a case distinction is made on the value of the denominator of a fraction, i.e. on the value of the following expression:

$$\begin{aligned} \sum _{\begin{array}{c} v\in W\\ f\in E \end{array}} \mu _i(w)(v)\cdot \mathsf {pre}(v)(f)\cdot \mu _i(e)(f). \end{aligned}$$
(13.4)

But this expression can be rewritten as

$$\begin{aligned} \sum _{\begin{array}{c} v\in W\\ f\in E\\ \varphi \in \varPhi \\ \mathbb {M},v\models \varphi \end{array}} \mu _i(w)(v)\cdot \mathsf {pre}(\varphi )(f)\cdot \mu _i(e)(f). \end{aligned}$$

Using the definition of \(k_{i,e,\varphi ,f}\) (cf. Fig. 13.3), this can be rewritten as

$$ \sum _{\begin{array}{c} \varphi \in \varPhi \\ f\in E \end{array}} \mu _i(w)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})\cdot k_{i,e,\varphi ,f}. $$

Since \(E\) and \(\varPhi \) are finite, this sum is finite and corresponds to an expression in the formal language \(\fancyscript{L}^\otimes \), which we will abbreviate as \(\sigma \):

$$ \sigma := \sum _{\begin{array}{c} \varphi \in \varPhi \\ f\in E \end{array}} k_{i,e,\varphi ,f}P_i(\varphi ). $$

This expression can be turned into an \(i\)-probability formula by ‘comparing’ it with a rational number \(b\); for example \(\sigma \ge b\). Particularly important are the formulas \(\sigma = 0\) and \(\sigma >0\): exactly these formulas are used to make the case distinction in the reduction axiom for \(i\)-probability formulas.Footnote 8

Fig. 13.3 Axiomatization of probabilistic dynamic epistemic logic

Next, the reduction axiom for \(i\)-probability formulas provides a statement in each case of the case distinction: \(0\ge b\) in the case \(\sigma =0\), and \(\chi \) (as defined in Fig. 13.3) in the case \(\sigma > 0\). We will only explain the meaning of \(\chi \) in the ‘interesting’ case \(\sigma >0\). If \(\mathbb {M},w\models \sigma >0\), then the value of (13.4) is strictly positive (cf. supra), and we can calculate:

$$\begin{aligned} \mu _i'(w,e)({{\mathrm{[\![\!}}}\psi {{\mathrm{\!]\!]}}}^{\mathbb {M}\otimes \mathbb {E}})&= \sum _{\mathbb {M}\otimes \mathbb {E},(w',e')\models \psi }\mu _i'(w,e)(w',e')\\ &= \sum _{\begin{array}{c} w'\in W, e'\in E\\ \mathbb {M},w'\models \langle \mathsf {E,e'}\rangle \psi \end{array}} \frac{\mu _i(w)(w')\cdot \mathsf {pre}(w')(e') \cdot \mu _i(e)(e')}{\sum _{\begin{array}{c} v\in W\\ f\in E \end{array}} \mu _i(w)(v)\cdot \mathsf {pre}(v)(f)\cdot \mu _i(e)(f)}\\ &= \frac{\sum _{\begin{array}{c} \varphi \in \varPhi \\ f\in E \end{array}} \mu _i(w)({{\mathrm{[\![\!}}}\varphi \wedge \langle \mathsf {E,f}\rangle \psi {{\mathrm{\!]\!]}}}^\mathbb {M})\cdot k_{i,e,\varphi ,f}}{\sum _{\begin{array}{c} \varphi \in \varPhi \\ f\in E \end{array}} \mu _i(w)({{\mathrm{[\![\!}}}\varphi {{\mathrm{\!]\!]}}}^\mathbb {M})\cdot k_{i,e,\varphi ,f}}. \end{aligned}$$

Hence, in this case (\(\sigma >0\)) we can express that \(\mu _i'(w,e)({{\mathrm{[\![\!}}}\psi {{\mathrm{\!]\!]}}}^{\mathbb {M}\otimes \mathbb {E}})\ge b\) in the formal language, by means of the following \(i\)-probability formula:

$$\begin{aligned} \sum _{\begin{array}{c} \varphi \in \varPhi \\ f\in E \end{array}} k_{i,e,\varphi ,f} P_i(\varphi \wedge \langle \mathsf {E,f}\rangle \psi ) \ge \sum _{\begin{array}{c} \varphi \in \varPhi \\ f\in E \end{array}} bk_{i,e,\varphi ,f}P_i(\varphi ). \end{aligned}$$

Moving to linear combinations, we can express that \(\sum _\ell a_\ell \mu _i'(w,e)({{\mathrm{[\![\!}}}\psi _\ell {{\mathrm{\!]\!]}}}^{\mathbb {M}\otimes \mathbb {E}})\ge b\) in the formal language using an analogous \(i\)-probability formula, namely \(\chi \) (cf. the definition of this formula in Fig. 13.3).

We thus obtain the following theorem [17]:

Theorem 13.3

Probabilistic dynamic epistemic logic, as axiomatized in Fig. 13.3, is sound and complete with respect to the class of probabilistic Kripke frames.

5 Further Developments and Applications

Probabilistic extensions of dynamic epistemic logic are a recent development, and there are various open questions and potential applications to be explored. In this section we discuss a selection of such topics for further research; more suggestions can be found in [17] and [15, ch. 8].

We distinguish between technical and conceptual open problems.Footnote 9 A typical technical problem that needs further research is the issue of surprising information. In the update mechanisms described in this chapter, the agents’ new probabilities are calculated by means of a fraction whose denominator might take on the value 0. The focus has been on the ‘interesting’ (non-0) cases, and the 0-case has been treated as mere ‘noise’: a technical artefact that cannot be handled convincingly by the system. However, sometimes such 0-cases do represent very intuitive scenarios; for example, one can easily think of an agent being absolutely certain that a certain proposition \(\varphi \) is false (\(P(\varphi ) = 0\)), while that proposition is actually true, and can thus be announced! In such cases, the system of probabilistic public announcement logic described in Sect. 13.3 predicts that the agent will simply ignore the announced information (rather than performing some sensible form of belief revision). More can, and should, be said about such cases [2, 6, 46].

Another technical question is whether other representations of soft information can borrow from the probabilistic approach to dynamic epistemic logic. Probabilistic Kripke models represent the agents’ soft information via the probability functions \(\mu _i\), and interpret formulas of the form \(P_i(\varphi ) \ge b\). Plausibility models, on the other hand, represent the agents’ soft information via a (non-numerical) plausibility ordering \(\le _i\), and interpret more qualitative notions of belief [7, 12, 15, 22]. In particular, if we use \(Min_{\le _i}(X)\) to denote the set of \(\le _i\)-minimal states in the set \(X\), then the formula \(B_i\varphi \) is interpreted in a plausibility model \(\mathbb {M}\) as follows:Footnote 10

$$\begin{aligned} \mathbb {M},w \models B_i\varphi \,\,\, \text { iff } \,\,\, \text {for all } v \in Min_{\le _i}(R_i[w]):\mathbb {M},v\models \varphi . \end{aligned}$$

The product update for probabilistic Kripke models described in Definition 13.7 takes into account prior probabilities (\(\mu _i(w)(v)\) for states \(w\) and \(v\)), observation probabilities (\(\mu _i(e)(f)\) for events \(e\) and \(f\)), and occurrence probabilities (\(\mathsf {pre}(w)(e)\) for a state \(w\) and event \(e\)). One can also define a product update for plausibility models; a widely used rule to define the updated plausibility ordering is the so-called ‘priority update’ [7, 15]:

$$\begin{aligned} (w,e) \le _i (v,f) \,\,\, \text { iff } \,\,\, e <_i f \text { or } (e \cong _i f \text { and } w \le _i v). \end{aligned}$$

The updated plausibility ordering thus gives priority to the plausibility ordering on events, and otherwise preserves the original plausibility ordering on states as much as possible. In analogy with the probabilistic setting, the plausibility orderings on states and events can be called the ‘prior plausibility’ and ‘observation plausibility’, respectively. However, the notion of occurrence probability does not seem to have a clear analogue in the framework of plausibility models. van Benthem [16] defines a notion of ‘occurrence plausibility’, which can be expressed as \(e \le _w f\): at state \(w\), event \(e\) is at least as plausible as \(f\) to occur (this ordering is not agent-dependent; recall Footnote 7). New product update rules thus have to merge three plausibility orderings: prior plausibility, observation plausibility, and occurrence plausibility. van Benthem [16] makes some proposals for such rules, but finding a fully satisfactory definition remains a major open problem in this area.
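For comparison with the probabilistic update, here is a minimal sketch of priority update under a simplifying assumption of ours: the plausibility preorders are total and encoded by numerical rank functions, with lower rank meaning more plausible.

```python
# Priority update on ranks: the event ordering takes priority, and ties are
# broken by the prior state ordering, i.e. lexicographic comparison.

def updated_order(pairs, state_rank, event_rank):
    # Sort the surviving (state, event) pairs from most to least plausible.
    return sorted(pairs, key=lambda we: (event_rank[we[1]], state_rank[we[0]]))

# Usage: updated_order([("w", "e"), ("v", "f")], {"w": 0, "v": 1}, {"e": 1, "f": 0})
# returns [("v", "f"), ("w", "e")]: the more plausible event f wins.
```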

An important conceptual issue that is currently being actively investigated is the exact relation between the quantitative (probabilistic) and qualitative perspectives on soft information. A widespread proposal is to connect belief with high probability, where ‘high’ means ‘at least as high as some threshold \(\tau \in (0.5,1]\)’; formally: \(B_i\varphi \Leftrightarrow P_i(\varphi )\ge \tau \). An immediate problem with this proposal is that belief is standardly taken to be closed under conjunction, whereas ‘high probability’ is not closed under conjunction (unless \(\tau =1\)). Despite this initial problem, there is also much to be said in favor of the proposal. Plausibility models interpret not only a notion of belief, but also a notion of conditional belief: \(B_i^\alpha \varphi \) means that agent \(i\) believes that \(\varphi \), conditional on \(\alpha \). The connection between belief and high probability extends naturally to conditional belief and high conditional probability:

$$\begin{aligned} B_i^\alpha \varphi \,\,\, \Leftrightarrow \,\,\, P_i(\varphi \,|\,\alpha )\ge \tau .\end{aligned}$$
(13.5)
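The failure of conjunction closure for \(\tau < 1\) is easy to check with hypothetical numbers (assuming, for simplicity, that \(\varphi \) and \(\psi \) are probabilistically independent):

```python
tau = 0.8
p_phi, p_psi = 0.9, 0.85          # each exceeds the threshold tau
p_conj = p_phi * p_psi            # = 0.765, assuming independence
print(p_phi >= tau, p_psi >= tau, p_conj >= tau)   # True True False
```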

Furthermore, (conditional) belief and high (conditional) probability seem to display highly similar dynamic behavior under public announcements. We saw in Sect. 13.3.3 that \([!\varphi ]P_i(\psi )\ge \tau \) can sometimes diverge in truth value from \(P_i(\psi \,|\,\varphi )\ge \tau \), because of the presence of higher-order information. In exactly the same way (and for the same reason), \([!\varphi ]B_i\psi \) and \(B_i^\varphi \psi \) can diverge in truth value on plausibility models. Moreover, (conditional) belief and high (conditional) probability have exactly the ‘same’ reduction axiom: (13.6) (which is interpreted on probabilistic Kripke models) and (13.7) (which is interpreted on plausibility models) are intertranslatable, using principle (13.5) above:

$$\begin{aligned}{}[!\varphi ]P_i(\psi \,|\,\alpha )\ge \tau&\,\,\, \leftrightarrow \,\,\, \Big (\varphi \rightarrow P_i(\langle !\varphi \rangle \psi \,|\,\langle !\varphi \rangle \alpha )\ge \tau \Big ), \end{aligned}$$
(13.6)
$$\begin{aligned} B_i^\alpha \psi&\,\,\, \leftrightarrow \,\,\, \Big (\varphi \rightarrow B_i^{\langle !\varphi \rangle \alpha }\langle !\varphi \rangle \psi \Big ). \end{aligned}$$
(13.7)

The significance of these observations is further discussed in [24].
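The divergence between \([!\varphi ]P_i(\psi )\ge \tau \) and \(P_i(\psi \,|\,\varphi )\ge \tau \) can be reproduced concretely when \(\psi \) is itself a higher-order formula. The following Python sketch builds a two-world probabilistic Kripke model of our own devising, takes \(\varphi = p\) and \(\psi = \)‘\(P_i(p)=1\)’, and compares the two sides: before the announcement the conditional probability of \(\psi \) given \(p\) is 0, yet after announcing \(p\) the formula \(\psi \) holds with probability 1.

```python
# Two worlds; p holds at u only; the agent assigns 0.5 to each world.
worlds = {"u", "v"}
p_worlds = {"u"}
mu = {w: {"u": 0.5, "v": 0.5} for w in worlds}

def psi_worlds(ws, m):
    """Worlds of the (sub)model where psi = 'P_i(p) = 1' holds."""
    return {w for w in ws
            if sum(m[w][v] for v in ws if v in p_worlds) == 1}

# Left-hand side: P_i(psi | p) at u, computed in the original model.
p_p = sum(mu["u"][v] for v in p_worlds)
p_cond = sum(mu["u"][v] for v in psi_worlds(worlds, mu) & p_worlds) / p_p
print(p_cond)                       # 0.0: psi holds at no world before !p

# Right-hand side: announce p -- restrict to p-worlds and renormalize.
ws2 = worlds & p_worlds
mu2 = {w: {v: mu[w][v] / sum(mu[w][x] for x in ws2) for v in ws2}
       for w in ws2}
print("u" in psi_worlds(ws2, mu2))  # True: after !p, P_i(p) = 1 holds at u
```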

Several fruitful applications of probabilistic dynamic epistemic logic can be expected in the fields of game theory and cognitive science. In recent years, dynamic epistemic logic has been widely applied to explore the epistemic foundations of game theory [10, 13, 18]. However, given the importance of probability in game theory (for example, in the notion of mixed strategy), it is surprising that very few of these logical analyses have a probabilistic component.Footnote 11 Probabilistic dynamic epistemic logic provides the required tools to explore the epistemic and the probabilistic aspects of game theory.

For example, [21, 23] use a version of probabilistic public announcement logic to analyze the role of common knowledge and communication in Aumann’s well-known agreeing to disagree theorem. Classically, this theorem is stated as follows: “If two people have the same prior, and their posteriors for an event [\(\varphi \)] are common knowledge, then these posteriors are equal” [3, p. 1236]. If we represent the experiments (with respect to which the agents’ probabilities are called ‘prior’ and ‘posterior’) by a dynamic operator \([\mathrm {EXP}]\), then this version can be formalized as (13.8), which is derivable in the underlying logical system:

$$\begin{aligned}{}[\mathrm {EXP}] C\big (P_1(\varphi )=a\wedge P_2(\varphi )=b\big ) \rightarrow a=b. \end{aligned}$$
(13.8)

However, this version does not say how the agents are to obtain this common knowledge; it simply assumes that they have obtained it one way or another. Common knowledge is obtained via a certain communication protocol, which is described explicitly in the intuitive scenario that is used to motivate and explain the theorem. Once this communication dynamics is made an explicit part of the story, common knowledge of the posteriors need no longer be assumed in the formulation of the agreement theorem, since it now simply follows from the communication protocol. If we represent the communication protocol by a dynamic operator \([\mathrm {DIAL}(\varphi )]\), this new version of the theorem can be formalized as (13.9):

$$\begin{aligned}{}[\mathrm {EXP}] [\mathrm {DIAL}(\varphi )]\big (P_1(\varphi )=a\wedge P_2(\varphi )=b\big ) \rightarrow a=b. \end{aligned}$$
(13.9)

The notion of common knowledge is thus less central to the agreement theorem than is usually thought: if we compare (13.8) and (13.9), it is clear that common knowledge and communication are two sides of the same coin; the former is only needed to formulate the agreement theorem if the latter is not represented explicitly.
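A small simulation conveys how a dialogue of the \([\mathrm {DIAL}(\varphi )]\) kind generates the required agreement. The sketch below is our own toy example (a uniform common prior over nine states and hypothetical partitions representing the agents’ experiments): the agents alternately announce their posteriors for the event, each announcement publicly rules out the states where the speaker would have announced a different value, and the posteriors converge.

```python
from fractions import Fraction

states = set(range(1, 10))                 # uniform common prior over 1..9
A = {3, 4}                                 # the event of interest
partition = {1: [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}],   # agent 1's experiment
             2: [{1, 2, 3, 4}, {5, 6, 7, 8}, {9}]}   # agent 2's experiment

def cell(i, s):                            # agent i's partition cell at s
    return next(c for c in partition[i] if s in c)

def posterior(i, s, public):               # P_i(A | own cell, public info)
    info = cell(i, s) & public
    return Fraction(len(A & info), len(info))

actual, public, speaker = 1, states, 1
history = []
while len(history) < 2 or history[-1] != history[-2]:
    q = posterior(speaker, actual, public)
    history.append(q)
    # Announcing q rules out every state where the speaker
    # would have announced a different value:
    public = {s for s in public if posterior(speaker, s, public) == q}
    speaker = 3 - speaker                  # agents take turns
print(history)   # [1/3, 1/2, 1/3, 1/3]: the posteriors end up equal
```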

Another potential field of application is cognitive science. The usefulness of (epistemic) logic for cognitive science has been widely recognized [14, 35, 43]. Of course, as in any other empirical discipline, one quickly finds that real-life human cognition is rarely a matter of all-or-nothing, but often involves degrees (probabilities). Furthermore, a recent development in cognitive science is the move toward probabilistic (Bayesian) models of cognition [42]. If epistemic logic is to remain a valuable tool here, it will thus have to be thoroughly ‘probabilized’. For example, probabilistic dynamic epistemic logic has been used to model the cognitive phenomenon of surprise and its epistemic aspects [25, 39].

6 Conclusion

In this chapter we have introduced probabilistic epistemic logic and several of its dynamic versions. These logics provide a standard epistemic (possible-worlds) analysis of the agents’ hard information, and supplement it with a fine-grained probabilistic analysis of their soft information. Higher-order information of any kind (knowledge about probabilities, probabilities about knowledge, etc.) is represented explicitly. The importance of higher-order information in dynamics is clear from our discussion of the connection between public announcements and Bayesian conditionalization. The probabilistic versions of both public announcement logic and dynamic epistemic logic with product updates can be completely axiomatized in a standard way (via reduction axioms). The fertility of the research program of probabilistic dynamic epistemic logic is illustrated by the variety of technical and conceptual issues that are still open for further research, and by its (potential) use in analyzing theorems and phenomena from game theory and cognitive science.