1 Introduction

Surprise is the automatic reaction to a mismatch. It is a (felt) reaction/response of alert and arousal due to an inconsistency (discrepancy, mismatch, non-assimilation, lack of integration) between an incoming input and our previous knowledge, in particular an actual or potential prediction. It invokes and mobilizes resources for a better epistemic processing of the incongruous information (attention, search, belief revision, etc.), but also for coping with the potential threat. Surprise is aimed at solving the inconsistency and at preventing possible dangers (the reason for the alarm) due to a lack of predictability and to a wrong anticipation.

Moreover, there are different kinds and levels of surprise. There is a first-hand surprise—the most peripheral one, due simply to the perceptual mismatch between what the agent sees and its sensory-motor expectations—while the deeper and slower forms of surprise are due to symbolic representations of expected events, and to the process of integrating the perceived data with previous long-term knowledge and explaining them (Meyer et al. 1997). This surprise is due to the implausibility of the new information. Low-level predictions are based on some form of statistical learning, on frequency and regular sequences, on judgment of normality in direct perceptual experience, on the strength of associative links and on the probability of activation (Kahneman and Miller 1986), or on mental simulation. On the other hand, high-level predictions have many different sources: analogy (“The first time I saw him he was very elegant, I think that he will be well dressed”) and, in general, inferences and reasoning (“He is Italian thus he will love pasta”), natural laws, and—in the social domain—norms, roles, conventions, habits, scripts (“He will not do so; here it is prohibited”), or Theory of Mind capacities (“He likes Mary, so he will invite her to dinner; He decided to go on vacation, so he will not be here on Monday”).

In this work we are mainly interested in the analysis of those forms of surprise which involve symbolic high-level representations of expected events or objects and a recognized event or object in the external world. We are not going to analyze those forms of surprise due to the mismatch between low-level sensory expectations and a raw perceptual input (raw sensor datum). We restrict our analysis to those forms of cognitive surprise involving an already perceived and recognized object or event. Footnote 1 In order to account for the process of cognitive recognition we have developed in our complementary work (Lorini and Castelfranchi 2006b) an abduction-based procedure of explanation assessment and selection. This procedure has the function of returning the best explanation of the data obtained by the sensors. We have shown that this selected explanation can mismatch with some pre-existent cognitive representation and therefore be responsible for the generation of surprise. In this work we do not introduce any abduction-based procedure of explanation selection and we simply assume that an agent can directly perceive and recognize an object or event without interpreting the perceptual raw sensor data by means of some abductive procedure.

In Sect. 2 a formal logic of beliefs and probabilities is introduced. This simple logic is developed in order to provide formal representations of several kinds of mental attitudes. We provide definitions for beliefs and expectations of an agent. Moreover, we characterize the notion of scrutinized expectation, i.e. the expectation on which the agent is focusing its attention and that the agent tries to match with the perceptual data. We introduce: (1) the special kind of mental operation retrieve with the function of introducing a new expectation into the mental test (scrutiny) space of the agent; (2) the special action of perceiving some fact with the function of modifying the agent’s perceptual data.

In Sect. 3 we analyze two different kinds of surprise which involve all informational mental states characterized in Sect. 2. We argue that these forms of surprise are the basic forms of surprise in cognitive systems involving symbolic high-level representations of expected events. Footnote 2

  1. Mismatch-based surprise (given the conflict between a perceived fact and a scrutinized representation). I’m actively checking whether a certain event is happening, that is, I have an endogenous anticipatory explicit representation of the next input and I attempt to match the incoming data against it. If there is a mismatch (conflict) between the two representations there is surprise. The intensity of this form of surprise is a function of the probability assigned to the expectation conflicting with the perceived fact.

  2. Astonishment or surprise in recognition. I perceive a certain fact and recognize the implausibility of this. The recognition of implausibility of the perceived fact can be based on two different kinds of mental processes.

    (a) On the one side, after perceiving a certain fact φ that I was not actively expecting, I can retrieve from my background knowledge the probability of the event and conclude that “I would not have expected that event.” The intensity of astonishment is a function of the probability assigned to the negation of the perceived fact (\({\neg\varphi}\)).

    (b) On the other side, after perceiving a certain fact φ I infer from my explicit beliefs the negation of the perceived fact.

We will argue that the previous typology of surprise is based on the characterization of different kinds of informational mental states Footnote 3, as summarised in Fig. 1.

Fig. 1 Ontology of informational mental states

According to our view an agent has a representation under scrutiny (a focused expectation) and this must be distinguished from all those accessible representations and expectations in the background (at an unconscious and automatic level). This distinction between expectations and beliefs under scrutiny and background expectations and beliefs looks similar to Kahneman and Tversky’s distinction (Kahneman and Tversky 1982) between active expectations and passive expectations. According to Kahneman and Tversky the former occupy consciousness and draw on the limited capacity of attention; the latter kind are available at a merely automatic and effortless level. Passive expectations could be the product of priming. Moreover, an agent looks at the world and acts in the world within a presupposed complex mental framework, a given presupposed frame (or script), which represents its unproblematic interpretation of the context of the situation where its action and perception are situated and which supports all (focused and background) expectations. Thus, when presupposing to be entering a restaurant, the agent can reasonably expect to perceive a waiter, some tables, and so on.

Expectations and beliefs under scrutiny, background expectations and beliefs, and presupposed frame are members of the general set of explicit informational mental states. The last category of informational mental states is the category of implicit expectations and beliefs, that is all those (potential) beliefs and expectations that can be inferred from explicit beliefs and expectations (Ortony and Partridge 1987; Levesque 1984).

In Sect. 4 we try to build a bridge between our model of surprise and the theory of belief change. We start by extending the formal language and introducing update processes whose function is the modification of beliefs and expectations of the agent after the perception of a new fact. We extensively investigate the role of surprise in triggering belief change and we provide cognitive principles governing this kind of cognitive phenomenon.

Let us sum up here the major claims of our work.

Surprise is a very relevant belief-based phenomenon, with mental and behavioral aspects. In order to understand it, it is necessary to model the relationships between basic properties of beliefs (like their strength, their being explicit or implicit, their being passively assumed or actively tested) and surprise kinds and dimensions. Surprise is a belief-based phenomenon because it is based on an actual or potential prediction formulated on the basis of other beliefs, and because one of its main effects is the revision of our assumptions: the new data must be assimilated in our knowledge base, and our beliefs (the basis of our wrong prediction) must be revised.

In the literature there are very good studies on surprise, which articulate several insightful hypotheses: the idea that surprise depends on expectations, the claim that its intensity depends on the “unexpectedness” of the stimulus (Ortony and Partridge 1987; Meyer et al. 1997), the claim that one can deal with this in terms of information or probability, or the claim that there are different kinds of expectations—active vs. passive, explicit vs. implicit—which produce different kinds of surprise. However, we argue that more careful distinctions and clearer characterizations of surprise are needed. We present here a ‘theory driven’ account of surprise, an analytical cognitive model which allows us to predict and distinguish different levels and kinds of surprise, not necessarily already discriminated in empirical research. Sometimes even common sense concepts look much richer: for example, the concept of “astonished” is not identical to the concept of “surprised,” or to the concept of “being disoriented.” We try to provide some principled and precise distinctions of these different levels and kinds of surprise and to formalize some of their relevant properties. It is not, for example, the same kind of surprise when we immediately recover from it, saying “How stupid I am! It is obvious! I should have expected this,” and when we remain in a strong and long-lasting state of suspension, unable to realize/accept and understand what has happened. Of course, we take into account some important psychological models (Ortony and Partridge 1987; Meyer et al. 1997; Reisenzein et al. 1996), which are very relevant and interesting, and also currently implemented models in AI (Macedo and Cardoso 2001). But they are still quite poor and simplified. For example, they do not clearly distinguish between the surprise relative to the invalidation of a strongly anticipated expectation, and the surprise relative to the degree of “unexpectedness” of the new incoming data. The two processes are—in our model—related and partially complementary, but not identical.

A complex view of surprise and of its nature and functions is necessary to understand the phenomenon. We do not model all its aspects. We do not investigate the experiential, phenomenal character of surprise (Reisenzein 2000): surprise as a felt signal. Footnote 4 Moreover, we do not model functional aspects of surprise (alert, learning, etc.) except those related to belief reconsideration (Sect. 4). As some psychologists have stressed (Meyer et al. 1991, 1997), surprise can culminate in a process of belief change. In this work we want to try to suggest some interesting ways to integrate a formal and computational model of surprise (viewed as a belief-based phenomenon) with a formal model of belief change. Indeed belief revision theory has been mostly focused on the problem of finding general principles characterizing the process of belief change, but has completely neglected to account for the causal precursors of this kind of process.

2 Formal bases

We define in this section the formal language with the related syntax and semantics. We use a logic of probabilistic quantified beliefs with a semantics similar to the one presented in Fagin and Halpern (1994) and Halpern (2003). We add to the basic formal language the standard dynamic operator for talking about actions. Moreover, we use special formal constructs to denote the representation under scrutiny (or under test) of a given agent and the agent’s perceptual data collected by its sensors.

We characterize two special kinds of actions: the mental operation retrieve, which has the function of moving new information into the scrutiny (test) space, and the action perceive, which has the function of modifying the agent’s perceptual data. The main function of the formalism is to disambiguate the relevant concepts and notions of our model of surprise.

2.1 Syntax

The primitives of the formal language are the following:

  • A set of atomic actions AT = {a,b,...}.

  • A set of propositional variables Π = {p,q,...}.

The set PROP = {φ,ψ,...} is the set of propositional formulas defined by the closure of Π under the Boolean operations ∧ and \({\neg}\) . On the one hand OBS is the set of perceptual actions defined as the smallest set such that:

  • if φ∈PROP then observe(φ)∈OBS.

On the other hand RETR is the set of retrieve mental operations defined as the smallest set such that:

  • if φ∈PROP then retrieve(φ)∈RETR.

ACT = {α,β,...} is the set of actions which is defined as the smallest set such that:

  • \({ AT\subseteq ACT}\) ;

  • \({ OBS\subseteq ACT}\) ;

  • \({ RETR\subseteq ACT}\) ;

  • if α and β∈ACT then α;β∈ACT (sequential composition).

Our language \({\mathcal{L}_{\mathcal{SURP}}}\) is given by the following rule in extended Backus–Naur Form:

$$ \Phi::=p|\neg\Phi|\Phi_{1}\wedge\Phi_{2}|Bel\Phi|\left[\alpha\right]\Phi|d_{1}P(\Phi_{1})+\cdots+d_{n}P(\Phi_{n})\geq c|Test(\varphi)|Datum(\varphi) $$

where p∈Π, φ∈PROP, α∈ACT and \(d_{1},\ldots,d_{n},c\) are real numbers. A primitive term is an expression of the form P(Φ). A basic probability formula is a statement of the form \({P(\Phi)\geq c}\). A term is an expression of the form \(d_{1}P(\Phi_{1})+\cdots+d_{n}P(\Phi_{n})\) where \(d_{1},\ldots,d_{n}\) are real numbers and \({n\geq 1}\). Finally a probability formula is a statement of the form \({t\geq c}\) where t is a term and c is a real number. We call formulas of the form BelΦ belief formulas, formulas of the form Test(φ) test formulas and formulas of the form Datum(φ) perception formulas. BelΦ reads “the agent believes that Φ”; \({P(\Phi)\geq c}\) reads “the agent assigns to the fact Φ at least probability c”; [α]Φ reads “always if the agent performs action α then Φ holds after α’s occurrence”; Test(φ) reads “φ is the representation that the agent is scrutinizing”; Datum(φ) reads “φ is a datum perceived by the agent.”

A propositional formula φ such that Test(φ) should be considered the content of the expectation that the agent is currently scrutinizing and matching against the incoming input data. On the other hand, a propositional formula φ such that Datum(φ) should be considered a datum obtained by the agent’s sensors. By perceptual datum we mean here something similar to the notion of datum given in Rescher (1976). A perceptual datum is in our vocabulary some piece of information gathered by the agent’s sensors which is a candidate for becoming a belief of the agent. Footnote 5 It will be shown below that both perceptual data and scrutinized representations play a crucial role in surprise processing.

Moreover, we use the following abbreviations:

$$ \left\langle \alpha\right\rangle \Phi=_{\rm def}\neg\left[\alpha\right]\neg\Phi; $$
$$ \sum_{i=1}^{n}d_{i}P(\Phi_{i})\geq c=_{\rm def}d_{1}P(\Phi_{1})+\cdots+d_{n}P(\Phi_{n})\geq c $$
$$ d_{1}P(\Phi_{1})\geq d_{2}P(\Phi_{2})=_{\rm def}d_{1}P(\Phi_{1})-d_{2}P(\Phi_{2})\geq0 $$
$$ \sum_{i=1}^{n}d_{i}P(\Phi_{i})\leq c=_{\rm def}\sum_{i=1}^{n}-d_{i}P(\Phi_{i})\geq-c $$
$$ \sum_{i=1}^{n}d_{i}P(\Phi_{i}) < c=_{\rm def}\neg\left(\sum_{i=1}^{n}d_{i}P(\Phi_{i})\geq c\right) $$
$$ \sum_{i=1}^{n}d_{i}P(\Phi_{i}) > c=_{\rm def}\neg\left(\sum_{i=1}^{n}d_{i}P(\Phi_{i})\leq c\right) $$
$$ \sum_{i=1}^{n}d_{i}P(\Phi_{i})=c=_{\rm def}\sum_{i=1}^{n}d_{i}P(\Phi_{i})\leq c\wedge\sum_{i=1}^{n}d_{i}P(\Phi_{i})\geq c $$

2.2 Semantics

We define \({\mathbf{M}}\) as the class of models of the form \({M=\langle W,B,R_{0},R_{1},R_{2},P,TEST,DATA,\pi\rangle}\) where each element of the tuple is defined as follows.

  • W = { w,w′,w′′,...} is a non-empty set of possible worlds.

  • B is a mapping \({B:W\longrightarrow2^{W}}\) associating a set of possible worlds B(w) to each possible world w. The elements in B(w) are the alternatives (worlds) that the agent considers possible at world w.

  • \({R_{0}}\), \({R_{1}}\), \({R_{2}}\) are mappings

    1. \({R_{0}:AT\longrightarrow(W\longrightarrow2^{W})}\)

    2. \({R_{1}:OBS\longrightarrow(W\longrightarrow2^{W})}\)

    3. \({R_{2}:RETR\longrightarrow(W\longrightarrow2^{W})}\)

    associating sets of possible worlds \({R_{0}^{a}(w)}\), \({R_{1}^{observe(\varphi)}(w)}\) and \({R_{2}^{retrieve(\varphi)}(w)}\) to each possible world w. Those worlds w′ such that \({w'\in R_{0}^{a}(w)}\), \({w'\in R_{1}^{observe(\varphi)}(w)}\) and \({w'\in R_{2}^{retrieve(\varphi)}(w)}\) are respectively those worlds which are achievable from w by doing the atomic action a, achievable by doing the action of perceiving φ and achievable by doing the operation of retrieving the expectation that φ from the background level.

  • P is a function which associates with each world w in W a probability space \({P(w)=(W_{w},X_{w})}\) where:

    • \({W_{w}\subseteq W}\) is called sample space;

    • \({X_{w}}\) is a probability function defined on \({W_{w}}\) such that \({X_{w}:W_{w}\longrightarrow\left[0,1\right]}\) and

    $$ \forall w\in W\quad \sum_{w'\in W_{w}}X_{w}(w')=1 $$
  • TEST is a (test) function \({TEST:W\longrightarrow PROP}\) Footnote 6 which assigns a propositional formula to each possible world. This function returns the representation that the agent is scrutinizing at a certain world, i.e. the representation on which the agent is focusing its attention and that the agent matches with the perceptual data.

  • DATA is a (perception) function \({DATA:W\longrightarrow PROP}\) which assigns a propositional formula to each possible world. The function returns the datum obtained by the agent’s sensors at world w.

  • \({\pi:\Pi\longrightarrow2^{W}}\) assigns a set of worlds to each propositional variable.

Here we suppose that B, TEST, DATA, every \({R_{0}^{a}}\), every \({R_{1}^{observe(\varphi)}}\) and every \({R_{2}^{retrieve(\varphi)}}\) are partial functions.

We use the following two notational abbreviations:

  • (Domain): \({||\Phi||^{W_{w}}}={\left\{ w'\in W_{w}|M,w'\vDash\Phi\right\} }\);

  • (Probability of a Domain): \({X_{w}(||\Phi||^{W_{w}})=\sum_{w'\in||\Phi||^{W_{w}}}X_{w}(w')}\) .

2.2.1 Truth conditions

  • \({M,w\vDash p \Longleftrightarrow w\in\pi(p)}\)

  • \({M,w\vDash\neg\Phi \Longleftrightarrow} \hbox{ not } {M,w\vDash\Phi}\)

  • \({M,w\vDash\Phi_{1}\wedge\Phi_{2} \Longleftrightarrow M,w\vDash\Phi_{1}} \hbox{ and } M,w\vDash\Phi_{2}\)

  • \({M,w\vDash Bel\Phi \Longleftrightarrow \forall w'} \hbox{ if } w'\in B(w) \hbox{ then }{M,w'\vDash\Phi}\)

  • \({M,w\vDash d_{1}P(\Phi_{1})+\cdots+d_{n}P(\Phi_{n})\geq c \Longleftrightarrow d_{1}X_{w}(||\Phi_{1}||^{W_{w}})+\cdots+d_{n}X_{w}(||\Phi_{n}||^{W_{w}})\geq c}\)

  • \({M,w\vDash Test(\varphi) \Longleftrightarrow \varphi=TEST(w)}\)

  • \({M,w\vDash Datum(\varphi) \Longleftrightarrow \varphi=DATA(w)}\)

  • \({M,w\vDash\left[\alpha\right]\Phi \Longleftrightarrow \forall w'} \hbox{ if }w'\in R^{\alpha}(w) \hbox{ then }{M,w'\vDash\Phi}\)

where \({R^{\alpha}(w)}\) is defined according to the following (1), (2), (3) and (4).

  1. \({R^{a}(w)=R_{0}^{a}(w)}\);

  2. \({R^{observe(\varphi)}(w)=R_{1}^{observe(\varphi)}(w)}\);

  3. \({R^{retrieve(\varphi)}(w)=R_{2}^{retrieve(\varphi)}(w)}\);

  4. \({R^{\alpha;\beta}(w)=(R^{\beta}\circ R^{\alpha})(w)}\).
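To make the semantics above concrete, the following minimal Python sketch (purely illustrative, and not part of the formal development) evaluates Bel and simple probability formulas on a toy two-world model; the class name Model, its helper methods and the example worlds are our own assumptions.

```python
# Minimal sketch of the semantics of Sect. 2.2 (illustrative only).
# A "model" holds: worlds, a valuation pi, doxastic alternatives B,
# a probability space (W_w, X_w) per world, and TEST/DATA labels.

class Model:
    def __init__(self, worlds, pi, B, sample, prob, test, data):
        self.worlds = worlds      # set of world names
        self.pi = pi              # dict: atom -> set of worlds where it is true
        self.B = B                # dict: world -> set of doxastic alternatives B(w)
        self.sample = sample      # dict: world -> sample space W_w
        self.prob = prob          # dict: world -> dict world' -> X_w(w')
        self.test = test          # dict: world -> scrutinized formula (a string)
        self.data = data          # dict: world -> perceived datum (a string)

    def atom_holds(self, p, w):
        return w in self.pi[p]

    def bel(self, p, w):
        # M,w |= Bel p  iff  p holds in every doxastic alternative of w
        return all(self.atom_holds(p, v) for v in self.B[w])

    def prob_of(self, p, w):
        # X_w(||p||^{W_w}): probability mass of the worlds of W_w where p holds
        return sum(self.prob[w][v] for v in self.sample[w] if self.atom_holds(p, v))

# Toy model: two worlds; in w1 the train is in the station, in w2 it is not.
M = Model(
    worlds={"w1", "w2"},
    pi={"train_in_station": {"w1"}},
    B={"w1": {"w1", "w2"}, "w2": {"w1", "w2"}},
    sample={"w1": {"w1", "w2"}, "w2": {"w1", "w2"}},
    prob={"w1": {"w1": 0.9, "w2": 0.1}, "w2": {"w1": 0.9, "w2": 0.1}},
    test={"w1": "train_in_station", "w2": "train_in_station"},
    data={"w1": "train_in_station", "w2": "not train_in_station"},
)

print(M.bel("train_in_station", "w1"))      # False: a world without the train is still possible
print(M.prob_of("train_in_station", "w1"))  # 0.9, i.e. P(train_in_station) >= 0.9 holds at w1
```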

2.3 Basic properties and definitions

As far as the probabilistic fragment of our logic is concerned, we inherit all axioms and inference rules given in Halpern (2003) and Fagin and Halpern (1994). Soundness and completeness of this deductive system for a logic of belief and probability have been proved. The axiom system consists of three different kinds of axioms and inference rules: (1) axioms and inference rules for propositional reasoning and the Bel modal operator Footnote 7; (2) axioms and inference rules for reasoning about probability Footnote 8; (3) axioms and inference rules for reasoning about linear inequalities Footnote 9.

Moreover, we suppose here that believing that Φ holds implies that the maximum value of probability is assigned to Φ. Formally:

$$ (Incl_{Bel/Prob})\quad Bel\Phi\rightarrow(P(\Phi)=1). $$

This axiom requires that the set of worlds which are considered possible by the agent in an arbitrary world w is a superset of the sample space with respect to the arbitrary world w:

  • for every \({w\in W}\), \({W_{w}\subseteq B(w)}\).

With respect to the dynamic component we use standard axioms from dynamic logic. We take the axioms and inference rules of the basic normal modal logic for the dynamic operator and the standard axiom for sequential composition:

  • \({(K)\left[\alpha\right](\Phi\rightarrow\Psi)\wedge\left[\alpha\right]\Phi\rightarrow\left[\alpha\right]\Psi }\)

  • ([α]−Necessitation) From \({\vdash\Phi}\) infer \({\vdash\left[\alpha\right]\Phi}\)

  • \({(Composition) \left[\alpha\right]\left[\beta\right]\Phi\longleftrightarrow\left[\alpha;\beta\right]\Phi}\) .

Moreover, we suppose the following axioms for atomic actions and perceptual actions:

  • \({(Det_{At}) \left\langle a\right\rangle \Phi\rightarrow\left[b\right]\Phi}\)

  • \({(Perc_{1}) \varphi\longleftrightarrow\left\langle observe(\varphi)\right\rangle \top}\)

  • \({(Perc_{2}) \left[observe(\varphi)\right]Datum(\varphi)}\)

  • \({(Perc_{3}) \left\langle observe(\varphi)\right\rangle \Phi\rightarrow\left[observe(\varphi)\right]\Phi}\) .

(Perc 1) says that: (1) it is always possible for the agent to perceive φ if φ is true in the external world and (2) if it is possible for the agent to perceive φ then φ is true in the external world. (Perc 2) says that after φ is perceived, φ becomes a perceptual datum, that is, the action of perceiving φ moves a new datum φ into the data space of the agent. (Perc 3) guarantees that perceptual actions are deterministic. (Det At ) guarantees that all atomic actions performed in a given world lead to the same world (they follow the same path).

We note that the previous axioms correspond to the following semantic constraints:

  • for every \({w\in W \hbox{ if } w'\in R_{0}^{a}(w) \hbox{ and }w''\in R_{0}^{b}(w) \hbox{ then }w'=w''}\) ;

  • for every \({w\in W R_{1}^{observe(\varphi)}(w)\neq\emptyset} \hbox{ if and only if }{M,w\vDash\varphi};\)

  • for every \({w\in W \hbox{ if } w'\in R_{1}^{observe(\varphi)}(w) \hbox{ then } \varphi=DATA(w')}\) ;

  • for every \({w\in W \hbox{ if } w'\in R_{1}^{observe(\varphi)}(w) \hbox{ and } w''\in R_{1}^{observe(\varphi)}(w) \hbox{ then } w'=w''}.\)

Finally, we suppose that the following are valid properties of retrieve mental operations:

  • \({(Retr_{1}) \left\langle retrieve(\varphi)\right\rangle \top\rightarrow\neg Test(\varphi)}\)

  • \({(Retr_{2}) \left[retrieve(\varphi)\right]Test(\varphi)}\)

  • \({(Retr_{3}) \left\langle retrieve(\varphi)\right\rangle \Phi\rightarrow\left[retrieve(\varphi)\right]\Phi}\).

(Retr 1) says that if it is possible for the agent to retrieve φ then φ is a representation which is not currently scrutinized. (Retr 2) says that after φ gets retrieved, φ is scrutinized by the agent. Thus the mental operation of retrieving φ has the function of modifying the mental setting of the agent, by moving a new representation φ into the test space of the agent. Finally, (Retr 3) guarantees determinism for retrieve mental operations. We note that the previous axioms correspond to the following semantic constraints:

  • for every \({w\in W \hbox{ if } \varphi=TEST(w) \hbox{ then } R_{2}^{retrieve(\varphi)}(w)=\emptyset}\) ;

  • for every \({w\in W \hbox{ if } w'\in R_{2}^{retrieve(\varphi)}(w) \hbox{ then } \varphi=TEST(w')}\) ;

  • for every \({w\in W \hbox{ if } w'\in R_{2}^{retrieve(\varphi)}(w) \hbox{ and }w''\in R_{2}^{retrieve(\varphi)}(w) \hbox{ then } w'=w''}.\)

We call SURPRISE the logic axiomatized by the axioms and inference rules for probabilities and beliefs given in Halpern (2003) and Fagin and Halpern (1994) and discussed above, the axiom Incl Bel/Prob, the previous axioms and inference rules for actions in general, and the special axioms for atomic actions, perceptual actions and retrieve mental operations. We call SURPRISE models the set of models \({\mathbf{M}_{Surp}\subseteq\mathbf{M}}\) satisfying all the semantic constraints imposed in this section and write ⊧ Surp φ if φ is valid in all SURPRISE models. Moreover, we write \({\vdash_{Surp}\varphi}\) if φ is a theorem of SURPRISE.

Having defined retrieve mental operations and formulated their properties, we can characterize the notion of background expectation (or background belief). A background (or passive) expectation is in our vocabulary an expectation whose content is available and accessible by means of a retrieve mental operation, that is, a background expectation is an expectation whose content can be mentally retrieved. Formally:

$$ Background(\varphi)=_{\rm def}\left\langle retrieve(\varphi)\right\rangle \top. $$

The present distinction between expectations and beliefs under scrutiny of the form Test(φ) and background expectations and beliefs of the form Background(φ) looks similar to the distinction given in psychology between active expectations and passive expectations (Tversky and Koehler 1994; Kahneman and Tversky 1982). According to Kahneman and Tversky the former “occupy consciousness and draw on the limited capacity of attention”; the latter kind are “automatic and effortless.” Passive expectations can be either permanent, such as categories and assumptions about the external world, or temporary, such as the priming effects in psychological experiments. Footnote 10
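As an illustration only, the following Python sketch treats Background(φ) as the executability of retrieve(φ). In line with (Retr 1), retrieve(φ) is executable only if φ is not already under scrutiny; the additional condition that φ belongs to an explicit background_store is an assumption we add for the example, not something imposed by the axioms.

```python
# Illustrative sketch (not part of the formal system): Background(phi) as the
# executability of retrieve(phi).  (Retr_1) requires that phi is not already
# under scrutiny; the "background_store" membership test is an extra
# assumption we add to make the example concrete.

def can_retrieve(state, phi):
    return phi != state["test"] and phi in state["background_store"]

def background(state, phi):
    # Background(phi) =def <retrieve(phi)>T
    return can_retrieve(state, phi)

def retrieve(state, phi):
    # (Retr_2): after retrieve(phi), phi is the scrutinized representation.
    if not can_retrieve(state, phi):
        raise ValueError("retrieve(%s) is not executable here" % phi)
    new_state = dict(state)
    new_state["test"] = phi
    return new_state

state = {"test": "mary_knocks_at_6",
         "background_store": {"policeman_knocks", "train_in_station"}}
print(background(state, "policeman_knocks"))         # True: available at the background level
print(retrieve(state, "policeman_knocks")["test"])   # 'policeman_knocks' is now under scrutiny
```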

3 Kinds of surprise

3.1 Mismatch-based surprise and astonishment

The objective of having an operational and cognitively plausible model of surprise is what gives rise to the need to introduce and exploit the notion of representation under scrutiny (representation to be tested). Indeed we want to model realistic cognitive systems which process input data and which are focused on a small portion of their internal information state. The purpose of this section is to clarify the distinction between mismatch-based surprise (there is a recognized conflict between the agent’s input data and the agent’s representation under scrutiny) and astonishment. While the notion of mismatch-based surprise is an operational notion and is associated with a recognized logical conflict between the incoming information and a representation under scrutiny, astonishment is in our view the response to the recognized implausibility of the input data. When I am astonished by something, I cannot believe what I see, and this presupposes that I am trying to believe it, trying to find an explanation for what I see, but remaining suspended. Astonishment seems to be due to a difficulty, to a delay in this process of integration and accounting for the data, a process which in this case is not automatic and fast, not immediately successful. We cannot in fact believe something by just putting it in our belief base; we must check for consistency (especially if there are reasons for suspecting some inconsistency). If the actual input generates an intense astonishment, this means that the input is unexpected and rather unpredictable from my actual beliefs. If I have to accept it, I have to adjust my beliefs in such a way that they can account for this unexpected event. Generally, in order to cope with an intense astonishment, I need a deep and large revision of my well-consolidated beliefs.

Example 1

Consider a person who is terribly late. He needs to take a train from Florence to Rome at 8:00 a.m. It is 7:56 a.m. and he is still running to reach the Florence station. Finally, he arrives at the station at exactly 8:00 a.m. and checks whether the train for Rome is standing in the station. At the moment of the perceptual test the agent has the representation of the train for Rome standing in the station explicit in his mind and attributes a high probability to this fact. When the agent perceives that the train for Rome is not standing in the station, the agent gets very surprised, since the incoming representation (logically) conflicts with the explicit representation that the train for Rome is standing in the station, and the probability assigned to the fact that the train for Rome is standing in the station is very high. This kind of surprise is what we call mismatch-based surprise.

Example 2

It is 5:50 p.m. and Bill is working in his office when Mary phones Bill and tells him: “I will come to your office at 6 p.m.! Wait for me there!” After Mary’s call Bill decides to stop working and to rest until Mary arrives. Bill expects with high probability that Mary will knock on the door of the office at 6 p.m. and focuses his attention on this. It is 5:53 p.m. and suddenly someone knocks on the door. Bill opens the door and sees that a policeman is standing in front of the door. There is no logical conflict between the scrutinized representation Mary will knock on the door of the office at 6 p.m. and the perceived fact a policeman knocks on the door of the office at 5:53 p.m. (indeed Mary knocks on the door at 6 p.m. is not inconsistent with a policeman knocks on the door at 5:53 p.m.). Thus there is no mismatch-based surprise. But Bill gets very astonished by perceiving the fact a policeman knocks on the door of the office at 5:53 p.m. Indeed Bill retrieves the information concerning a policeman knocking on the door of the office at 5:53 p.m. from his background knowledge and recognizes the implausibility of the perceived fact given what he knows (“I wouldn’t have expected to perceive a policeman knocking on the door of my office!”).

Let us consider more carefully the two notions of mismatch-based surprise and astonishment from a qualitative and quantitative point of view. We want to specify the mental configurations associated with these two emotional responses and to provide the criteria to quantify them (to measure their intensity).

Definition 1

Mismatch-based surprise (given the conflict between a perceived fact and a scrutinized representation). The cognitive configuration of mismatch-based surprise relative to the mismatch between a perceptual datum ψ and a scrutinized representation φ is defined by the following facts:

  1. ψ is the agent’s perceptual datum;

  2. φ is the representation scrutinized by the agent; and

  3. the agent believes that φ and ψ are incompatible facts.

Formally: \({MismatchSurprise(\psi,\varphi)=_{\rm def}Datum(\psi)\wedge Test(\varphi)\wedge Bel(\psi\rightarrow\neg\varphi).}\)

Definition 2

(Retrieval-based) Astonishment. The cognitive configuration of (retrieval-based) astonishment relative to a perceptual datum ψ is defined by the following facts:

  1. ψ is the agent’s perceptual datum;

  2. the agent can retrieve from its background knowledge either the expectation that \({\neg\psi}\) or the expectation that ψ, that is, either the expectation that \({\neg\psi}\) or the expectation that ψ is “mentally” available at a background level.

Formally: \({Astonishment(\psi)=_{\rm def}Datum(\psi)\wedge(Background(\neg\psi)\vee Background(\psi))}\) Footnote 11
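The following Python sketch (an illustration under simplifying assumptions, not a definitive implementation) checks the cognitive configurations of Definitions 1 and 2 on an explicit agent state; representing Bel(ψ→¬φ) as a set of incompatibility pairs and Background(·) as membership in a background store are our own simplifications.

```python
# Sketch of Definitions 1 and 2 (our own simplified encoding).  The belief
# Bel(psi -> not phi) is represented by a set of pairs the agent explicitly
# takes to be incompatible; Background(.) by membership in a background store.

def mismatch_surprise(state, psi, phi):
    # Datum(psi) & Test(phi) & Bel(psi -> not phi)
    return (state["datum"] == psi
            and state["test"] == phi
            and (psi, phi) in state["incompatible"])

def astonishment(state, psi):
    # Datum(psi) & (Background(not psi) or Background(psi))
    return (state["datum"] == psi
            and (("not " + psi) in state["background_store"]
                 or psi in state["background_store"]))

state = {
    "datum": "policeman_knocks_553",
    "test": "mary_knocks_at_6",
    "incompatible": set(),                       # no believed incompatibility here
    "background_store": {"policeman_knocks_553"},
}
print(mismatch_surprise(state, "policeman_knocks_553", "mary_knocks_at_6"))  # False (as in Example 2)
print(astonishment(state, "policeman_knocks_553"))                           # True
```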

Definition 3

Intensity of mismatch-based surprise (given the conflict between a perceived fact and a scrutinized representation). The mismatch-based surprise relative to the mismatch between a perceptual datum ψ and a scrutinized representation φ has intensity equal to (or higher than) c if and only if the probability assigned to the scrutinized expectation that φ (invalidated by the perceived fact ψ) is equal to (or higher than) c.

Formally:

  • \({IntensityMismatchSurprise(\psi,\varphi)\geq c=_{\rm def}MismatchSurprise(\psi,\varphi)\wedge P(\varphi)\geq c}\)

  • \({IntensityMismatchSurprise(\psi,\varphi) > c=_{\rm def} MismatchSurprise(\psi,\varphi)\wedge P(\varphi) > c}\)

  • \({IntensityMismatchSurprise(\psi,\varphi)=c=_{\rm def}MismatchSurprise(\psi,\varphi)\wedge P(\varphi)=c}\) .

Definition 4

Intensity of (retrieval-based) astonishment. The (retrieval-based) astonishment relative to a perceptual datum ψ has intensity equal to (or higher than) c if and only if the probability assigned to \({\neg\psi}\) (the negation of the perceived fact) is equal to (or higher than) c.

Formally:

  • \({IntensityAstonishment(\psi)\geq c=_{def}Astonishment(\psi)\wedge P(\neg\psi)\geq c}\)

  • \({IntensityAstonishment(\psi) > c=_{def}Astonishment(\psi)\wedge P(\neg\psi) > c}\)

  • \({IntensityAstonishment(\psi)=c=_{def}Astonishment(\psi)\wedge P(\neg\psi)=c}.\)

According to Definitions 3 and 4 the intensity of (retrieval-based) astonishment is equal to the probability assigned to the opposite of the perceived fact (we can call it degree of unexpectedness of the perceived fact as in Ortony and Partridge, 1987) whereas the intensity of mismatch-based surprise is equal to the probability assigned to the formula invalidated by the perceived fact.
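A minimal sketch of Definitions 3 and 4, assuming the agent’s subjective probability is given directly as a mapping from propositional formulas to numbers (in the logic it is induced by the probability spaces \((W_{w},X_{w})\)); the function names and the numeric values are illustrative.

```python
# Sketch of Definitions 3 and 4 (illustrative).  The agent's subjective
# probability is given directly as a mapping from propositional formulas to
# numbers; computing P(not psi) as 1 - P(psi) assumes a proper probability function.

def intensity_mismatch_surprise(P, phi):
    # equals the probability assigned to the scrutinized expectation phi
    # that the perceived datum invalidates
    return P[phi]

def intensity_astonishment(P, psi):
    # equals the probability assigned to the negation of the perceived fact
    return 1.0 - P[psi]

P = {"train_in_station": 0.95, "whale_in_thames": 0.001}
print(intensity_mismatch_surprise(P, "train_in_station"))   # 0.95 (Example 1)
print(intensity_astonishment(P, "whale_in_thames"))          # 0.999 (Example 3)
```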

Let us discuss some formal properties of (retrieval-based) astonishment and mismatch-based surprise.

Proposition 1


  • \({\models_{Surp}IntensityMismatchSurprise(\psi,\varphi)=c\wedge(Background(\psi)\vee Background(\neg\psi))} {\rightarrow IntensityAstonishment(\psi)\geq c}\)

The previous proposition says that if the agent is surprised by the mismatch between the perceptual datum ψ and a scrutinized expectation that φ, and this surprise has intensity c, then, if the agent has either an available background expectation that ψ or an available background expectation that \({\neg\psi}\), the intensity of astonishment is equal to or higher than c. Therefore (retrieval-based) astonishment is by nature at least as intense as mismatch-based surprise. The reader should also note that the two dimensions of surprise are not necessarily complementary (the sum of the two is not necessarily equal to 1). Indeed I could be surprised with intensity 0.5 by the mismatch between the perceptual datum ψ and the scrutinized expectation that φ and be astonished with intensity 0.7 by the recognized implausibility of ψ. Thus the two kinds of surprise are both qualitatively and quantitatively different.
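The following small numeric illustration of Proposition 1 (our own example) uses the fact that Bel(ψ→¬φ) together with (Incl Bel/Prob) forces P(ψ∧φ)=0, from which P(φ)≤P(¬ψ).

```python
# Numeric illustration of Proposition 1.  Bel(psi -> not phi) together with
# (Incl_Bel/Prob) forces P(psi & phi) = 0, from which P(phi) <= P(not psi).
# The joint distribution below is an arbitrary example satisfying that constraint.

joint = {               # P over the four combinations of (phi, psi)
    (True, True): 0.0,  # Bel(psi -> not phi)  =>  P(phi & psi) = 0
    (True, False): 0.5,
    (False, True): 0.3,
    (False, False): 0.2,
}
P_phi = sum(p for (phi, _), p in joint.items() if phi)          # 0.5
P_not_psi = sum(p for (_, psi), p in joint.items() if not psi)  # 0.7
print(P_phi, P_not_psi, P_phi <= P_not_psi)  # astonishment intensity >= mismatch-surprise intensity
```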

Often mismatch-based surprise and (retrieval-based) astonishment occur together after having perceived a certain fact ψ. According to Proposition 1, in this case the intensity of (retrieval-based) astonishment is at least as high as the intensity of mismatch-based surprise. Consider the following scenario.

Example 3

Imagine a person walking along the Thames. The person is scrutinizing whether the Tower of London is there (φ) and is attributing a high probability to this fact. Suddenly the person turns his eyes toward the river and perceives that there is a whale (ψ) (see the recent facts in London). The person gets surprised because of the recognition of the incompatibility between ψ and φ. Indeed the person believes that \({\psi\rightarrow\neg\varphi}\). But he also gets highly astonished. Indeed the person recognizes (after having retrieved from his background knowledge the information about ψ) the implausibility of the fact there is a whale (or, even more strongly, the impossibility of the fact there is a whale). The intensity of the astonishment is equal to the probability assigned to \({\neg\psi}.\)

3.2 Inference-based astonishment

In the previous section we have defined astonishment as the kind of surprise which involves a recognized implausibility of a perceived fact φ. We have assumed that the recognition of the implausibility of the perceived fact φ is based on the mental availability of either the expectation that φ or the expectation that \({\neg\varphi}\). Indeed, according to Definition 2, retrieval-based astonishment concerns those background passive expectations that the agent can retrieve from the background level. As noticed by Ortony and Partridge (1987), surprise can also arise from an inconsistency between an implicit passive expectation and the input proposition. By implicit expectations they mean all those facts that can be inferred from the explicit beliefs by few and simple deductions (see Fig. 1).

We think that Ortony and Partridge’s distinction is relevant for a model of surprise and that in order to implement it formally we should relax the assumption of logical omniscience of the agent. To do this formally we should identify a subset of the complete set of beliefs and call it the set of explicit beliefs (or belief base as in the tradition of belief revision Footnote 12). This is the set of beliefs that the agent can use to make inferences and which is not closed under classical inference. Footnote 13 Given the set of explicit beliefs we could define implicit (passive) beliefs as all those beliefs which can be inferred from the elements of the belief base (and which are not members of the belief base).

Having defined a set of explicit beliefs and a set of implicit beliefs, we can make more precise our definition of astonishment. Indeed we can account for the astonishment due to a recognized conflict between a post-hoc belief or expectation (a belief which is inferred from the explicit beliefs and which was implicit before the perception) and the incoming input data: we call it inference-based astonishment.

Since the distinction between explicit and implicit belief is not formally specified under the present analysis we only give here a verbal characterization of inference-based astonishment.

Definition 5

(Inference-based) Astonishment. The cognitive configuration of (inference-based) astonishment relative to a perceptual datum ψ is defined by the following facts:

  1. ψ is the agent’s perceptual datum (something perceived by the agent);

  2. the agent can infer, and effectively infers, \({\neg\psi}\) from its explicit beliefs (where \({\neg\psi}\) was the content of an implicit belief before the perception).

For completeness we should also consider all cases of post-hoc reconstruction of the probability of the perceived event. This would allow us to generalize Definition 5. In those cases, while attempting to assimilate/integrate the perceived datum, the agent “derives” that the event is not so probable (this is different from inferring some fact which is incompatible with the perceived fact). While asking himself “was this unpredicted event/datum predictable?”, the agent reconstructs the probability of the event and concludes that “I would never have expected that.” Therefore the intensity of inference-based astonishment relative to the perceived fact ψ must depend on the probability assigned to \({\neg\psi}\) (the higher the probability assigned to \({\neg\psi}\), the more intense the astonishment).
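A hedged sketch of the inferential path of Definition 5: after perceiving ψ, the agent tries to derive ¬ψ from an explicit belief base by a few forward-chaining steps. The rule format and the example belief base are our own illustrative assumptions.

```python
# Illustrative sketch of Definition 5: after perceiving psi, the agent tries to
# derive "not psi" from its explicit belief base by simple forward chaining.
# The rule representation (premises -> conclusion) is our own simplification.

def forward_chain(facts, rules, max_steps=5):
    derived = set(facts)
    for _ in range(max_steps):                 # few and simple deductions only
        new = {concl for prem, concl in rules if prem <= derived} - derived
        if not new:
            break
        derived |= new
    return derived

def inference_based_astonishment(datum, explicit_facts, rules):
    # astonished if the negation of the datum is derivable from the explicit beliefs
    return ("not " + datum) in forward_chain(explicit_facts, rules)

facts = {"office_door_locked", "no_appointment_with_police"}
rules = [({"no_appointment_with_police"}, "not policeman_knocks")]
print(inference_based_astonishment("policeman_knocks", facts, rules))  # True
```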

We have provided two different notions of astonishment. On the one side (Definition 2), after perceiving ψ there is a simple retrieval of the expectation that ψ or of the expectation that \({\neg\psi}\), when one of the two is mentally available at the background (passive) level. On the other side (Definition 5), either the negation of the perceived fact is inferred from the explicit beliefs or there is a post-hoc reconstruction of the probability of the perceived fact (a probabilistic inference). In both cases some mental operation must be done in order to make the agent aware of the implausibility of the perceived fact.

We conclude this section by summarizing our basic ontology of on-line surprise (whose cognitive configuration is obtained during the perceptual phase and before a possible belief reconsideration). In our view at least three species must be considered: surprise based on the mismatch between a representation under scrutiny and an incoming input (Definition 1), retrieval-based astonishment (Definition 2) and inference-based astonishment (Definition 5).

3.3 Some comments

In this section let us stress the main differences between our approach and Ortony and Partridge’s approach by making explicit the most important issues that are neglected in their model and that our model tries to clarify.

Ortony and Partridge’s model does not capture, in our view, the important distinction between the previous two kinds of astonishment (retrieval-based astonishment and inference-based astonishment). Their model is only focused on inference-based astonishment and completely neglects to account for the other important kind.

In inference-based astonishment, the subject did not in fact derive the prediction/expectation that \({\neg\varphi}\) before perceiving φ (the prediction is just potential and implicit in its mind). While attempting to assimilate/integrate the new data he infers from his explicit beliefs the opposite. Therefore the mental operation involved in this kind of astonishment is an inferential action Footnote 14: it transforms some potential and implicit expectation (or belief) into an explicit and scrutinized one. This is exactly the content of the informal Definition 5 given in Sect. 3.2.

In retrieval-based astonishment on the contrary, when perceiving φ a pre-existent expectation that φ (or a pre-existent expectation that \({\neg\varphi}\)) is available (it can be retrieved from the background level even without a constructive inferential process). Indeed in our view an agent always has a certain number of accessible beliefs and expectations in the background (at an unconscious and automatic level) and these expectations and beliefs in the background must be distinguished from the representation under scrutiny formally identified as a test formula (see Fig. 1 and Sect. 2.3). When perceiving φ, retrieval-based astonishment may simply arise from the automatic retrieval of either the background probabilistic expectation that φ or the background probabilistic expectation that \({\neg\varphi}\). Therefore the mental operation involved in retrieval-based astonishment is a retrieve mental operation which transforms some background expectation (or belief) into a scrutinized expectation. This is exactly the content of the formal Definition 2 given in Sect. 3.1.

In our view Ortony and Partridge’s model does not capture this distinction between (1) surprise arising from the recognition of the implausibility of the perceived fact due to an inferential process from my explicit beliefs and (2) surprise arising from the recognition of the implausibility of the perceived fact due to a retrieval of a background expectation. The incompleteness of Ortony and Partridge’s model is due to the lack of distinction between background expectations and representations on the one side and implicit expectations and beliefs on the other side (indeed they only account for the second kind). This distinction is relevant in our approach and it allows us to articulate a richer typology of surprise.

Moreover, in our model there are two parallel components and paths for surprise, and there are two parameters that we should take into account in order to quantify surprise (see Fig. 2 below).

  (i) I can have an expectation under scrutiny whose content is φ (the expected event or entity): when this prediction is invalidated, i.e. happens to be wrong, this means that I perceive something different. In other words there is an input datum ψ mismatching with φ. Even nothing is something: the absence of any object when I was expecting and scrutinizing φ, that is the fact that φ does not happen (\({\neg\varphi}\)), is in any case an unpredicted/unexpected input datum which invalidates the representation under scrutiny φ.

  (ii) Having perceived ψ, the expectation that ψ (or the expectation that \({\neg\psi}\)) is available at the background and unconscious level (or the expectation that \({\neg\psi}\) is inferred from explicit beliefs and expectations).

Fig. 2 Surprise processing

We claim that, on the one side, surprise is a function of the probability of the invalidated fact under scrutiny (φ); while, on the other side, it is a function of the improbability of the perceived fact ψ. On the one side, the more certain was my scrutinized expectation, the more probable is φ, the more surprised I am Footnote 15 (see Definition 3 in Sect. 3.1). On the other side, the more unpredictable and unexpected ψ is (the more expected \({\neg\psi}\)), the more astonished I am (see Definition 4 in Sect. 3.1 as well as the generalization of Definition 5 which deals with probabilistic inference). To distinguish these two facets, components, and processes we have proposed to use for the former case the term mismatch-based surprise (the signal of the invalidation of the expectation under scrutiny), and the term astonishment (either retrieval-based astonishment or inference-based astonishment) for the latter case. Ortony and Partridge seem to consider as surprise only the second phenomenon and path. Indeed according to their model the intensity of surprise only depends on the degree of “unexpectedness” of the perceived fact. Footnote 16

But surprise processing does not necessarily involve both paths. Indeed one can be surprised by some perceived fact ψ one did not expect without having to expect and test something else which is evaluated to be incompatible with ψ: astonishment does not necessarily presuppose mismatch-based surprise. Moreover, one can be surprised by some perceived fact ψ which is evaluated to be incompatible with some scrutinized fact φ without being astonished by the recognized implausibility of ψ: a mismatch-based surprise does not necessarily entail an astonishment as a felt reaction.

4 Surprise and belief change

As some psychologists have stressed (Meyer et al. 1991, 1997), surprise can culminate in a process of belief change. The aim of the following analysis is to suggest some interesting ways a formal model of cognitive surprise can be integrated with a formal model of cognitive belief change.

Formal approaches to belief revision are mainly interested in finding rationality principles and postulates driving belief change: this is for instance the main purpose of the classical AGM theory (Alchourron et al. 1985). All those models implicitly assume that when the agent perceives some new fact φ the perception is always a precursor of a belief change with φ. Thus the main problem with AGM theory is that it fails to identify the precursors of belief change.

Our attempt here is to clarify under what conditions belief revision should be triggered after having perceived a certain fact. We claim that surprise plays a crucial role in triggering this kind of process and that it is implausible to assume that realistic cognitive agents revise their beliefs with φ every time they perceive a new fact φ. Realistic and non-omniscient cognitive agents are situated in complex environments where many tasks must be solved. Since accurate belief revision and update require time and considerable computational costs, realistic cognitive agents need some mechanism which is responsible:

  1. for signaling the global inconsistency of the knowledge base with respect to the incoming input and

  2. for the revision of beliefs and expectations of the agent.

One of the adaptive functions of surprise is exactly this.

Belief change in cognitive agents is triggered by a very surprising incoming input. The intensity of surprise relative to the incoming input “signals” to the agent that things are not going as expected and that the knowledge of the environment must be reconsidered. Indeed wrong beliefs generally lead to poor performance and to failure in intention and goal fulfillment.

On the other hand, resource-bounded cognitive agents do not generally reconsider their beliefs and expectations when the input data are not recognized to be incompatible or implausible with respect to their pre-existent knowledge. Indeed it is not convenient for the survival of the agent to update or reconsider beliefs every time a new fact is perceived. When the world flows as expected and we are not aware of the inadequacy of our knowledge of the world, we do not need to criticize and reconsider this knowledge. Indeed reconsidering beliefs after every perception would strongly interfere with the agent’s ongoing performance and would continuously divert its attention away from its intentionally driven activity.

A model of cognitive belief change should be able to account for this trade-off between extensive belief change triggered by surprise and belief change avoidance when perception does not generate surprise.
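The following sketch illustrates this trade-off under our own simplifying assumptions: a costly belief update is run only when the intensity of surprise produced by the incoming datum exceeds a threshold; the threshold value and the toy update routine are illustrative choices, not part of the formal model.

```python
# Illustrative sketch of surprise-triggered belief change (Sect. 4).  A costly
# update is run only when the intensity of surprise exceeds a threshold; the
# threshold value and the Bayesian-style update routine are our own choices.

SURPRISE_THRESHOLD = 0.8

def surprise_intensity(P, datum):
    return 1.0 - P.get(datum, 0.5)        # degree of unexpectedness of the datum

def bayesian_update(P, datum):
    # toy stand-in for full belief reconsideration: the datum becomes certain
    new_P = dict(P)
    new_P[datum] = 1.0
    return new_P

def perceive(P, datum):
    if surprise_intensity(P, datum) > SURPRISE_THRESHOLD:
        return bayesian_update(P, datum)  # expensive reconsideration is triggered
    return P                              # world flows as expected: no reconsideration

beliefs = {"train_in_station": 0.95, "whale_in_thames": 0.001}
beliefs = perceive(beliefs, "train_in_station")   # unsurprising: beliefs untouched
beliefs = perceive(beliefs, "whale_in_thames")    # very surprising: update triggered
print(beliefs["whale_in_thames"])                 # 1.0
```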

4.1 Dealing with unexecutable updates

In Kooi (2003), a combination of the dynamic epistemic logic of Gerbrandy (Gerbrandy 1999; Gerbrandy and Groeneveld 1997) with the probabilistic logic of Fagin and Halpern is given. This combination results in a probabilistic dynamic epistemic logic where it is possible to talk about beliefs and probabilities as well as information change for beliefs and probabilities. In this probabilistic extension of Gerbrandy’s logic of information update the symbol φ! is introduced. φ! is the process of updating beliefs with an arbitrary sentence φ. Footnote 17

The aim of this section is to suggest a way to modify the framework given in Kooi (2003), Gerbrandy (1999), Gerbrandy and Groeneveld (1997) in order to investigate the role of surprise in information change.

In order to do this we must import update processes into our formal language \({\mathcal{L}_{\mathcal{SURP}}}\).

UPD is the set of update processes defined as the smallest set such that:

  • if \({\varphi \in PROP \hbox{ then }\varphi!\in UPD}\) .

We call \({\mathcal{L}_{\mathcal{SURP}+}}\) the new extended language with update processes. The new language \({\mathcal{L}_{\mathcal{SURP}+}}\) is given by the following rule in extended Backus–Naur form:

$$\Phi::=p|\neg\Phi|\Phi_{1}\wedge\Phi_{2}|Bel\Phi|\left[\alpha\right]\Phi|d_{1}P(\Phi_{1})+\cdots+d_{n}P(\Phi_{n})\geq c|Test(\varphi)|Datum(\varphi)|\left[\varphi!\right]\Phi $$

where p∈Π, φ∈PROP, α∈ACT, \(d_{1},\ldots,d_{n},c\) are real numbers and φ!∈UPD.

The semantics of formulas in \({\mathcal{L}_{\mathcal{SURP}+}}\) is the same semantics given for formulas in \({\mathcal{L}_{\mathcal{SURP}}}\) (see Sect. 2.2). We only need to provide a semantics for formulas of the form [φ!]Φ. This is given next. Footnote 18

  • \({M,w\vDash\left[\varphi!\right]\Phi \Longleftrightarrow \forall(M',w')}\quad \hbox{ if } (M',w')\in R^{\varphi!}(M,w) \hbox{ then } {M',w'\vDash\Phi}\)

We suppose that \({R^{\varphi!}(M,w)}\) is defined according to the following Definition 6.

Definition 6

Given a model \({M=\langle W,B,R_{0},R_{1},R_{2},P,TEST,DATA,\pi\rangle}\), a world w∈W and a propositional formula φ∈PROP we suppose that

$$\hbox{EITHER} \;R^{\varphi!}(M,w)=\emptyset \hbox{ OR } R^{\varphi!}(M,w)=(\widetilde{M^{\varphi}},\widetilde{w^{\varphi}}).$$

Moreover, we suppose that \({\widetilde{M^{\varphi}}}\) and \({\widetilde{w^{\varphi}}}\) are defined as follows.

  1. \({\widetilde{w^{\varphi}}=w}.\)

  2. \({\widetilde{M^{\varphi}}=\langle W,\widetilde{B^{\varphi}},R_{0},R_{1},R_{2},\widetilde{P^{\varphi}},TEST,DATA,\pi \rangle}\) where \({\widetilde{P^{\varphi}}(w)=(\widetilde{W_{w}^{\varphi}},\widetilde{X_{w}^{\varphi}})}\) and \({\widetilde{W_{w}^{\varphi}}}\), \({\widetilde{X_{w}^{\varphi}}}\), \({\widetilde{B^{\varphi}}}\) are defined according to the following (a), (b) and (c).

    (a) for all \(w\in W: {\widetilde{B^{\varphi}}(w)=\left\{ w'|w'\in B(w) \hbox{ and } M,w'\models\varphi\right\} }\).

    (b) for all \(w\in W: {\widetilde{W_{w}^{\varphi}}=\left\{ w'\in W_{w}|M,w'\models\varphi\right\} }\).

    (c) for all \(w\in W\) and \({w'\in\widetilde{W_{w}^{\varphi}}}: {\widetilde{X_{w}^{\varphi}}(w')=\frac{X_{w}(w')}{X_{w}(||\varphi||^{W_{w}})}}.\)

According to Definition 6 updating with φ either cannot be performed or yields an updated model \({\widetilde{M^{\varphi}}}\) which differs from the original model only with respect to the accessibility relations for the Bel modal operator and the probability functions. When an update with φ is successfully performed, the original model M is transformed into the updated model \({\widetilde{M^{\varphi}}}\) in such a way that, for every world w, the alternatives that the agent considers possible where φ does not hold are removed from B(w), and the worlds where φ does not hold are removed from the sample space \({W_{w}}\). Moreover, for every world w the probability function is redefined according to Condition 2(c).
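For illustration, the following Python sketch applies the transformation of Definition 6 to a toy representation of a single world w: the ¬φ-worlds are removed from B(w) and from the sample space, and the probability mass is renormalized; when P(φ)=0 the update returns no result, in line with the postulate (NotZero Upd) introduced below. The data structures and names are our own.

```python
# Sketch of the update of Definition 6 on a toy representation (illustrative).
# Worlds where phi fails are removed from B(w) and from the sample space W_w,
# and the probability function is renormalized; if P(phi) = 0 the update is
# not executable, in line with (NotZero_Upd).

def update(B_w, sample_w, X_w, holds_phi):
    mass = sum(X_w[v] for v in sample_w if holds_phi(v))
    if mass == 0.0:
        return None                                   # R^{phi!}(M,w) is empty
    new_B = {v for v in B_w if holds_phi(v)}
    new_sample = {v for v in sample_w if holds_phi(v)}
    new_X = {v: X_w[v] / mass for v in new_sample}    # condition 2(c): renormalize
    return new_B, new_sample, new_X

B_w = {"w1", "w2", "w3"}
sample_w = {"w1", "w2", "w3"}
X_w = {"w1": 0.5, "w2": 0.3, "w3": 0.2}
phi_worlds = {"w1", "w2"}                             # worlds where phi holds
print(update(B_w, sample_w, X_w, lambda v: v in phi_worlds))
# e.g. ({'w1', 'w2'}, {'w1', 'w2'}, {'w1': 0.625, 'w2': 0.375})
```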

In Kooi (2003), Gerbrandy and Groeneveld (1997) and Gerbrandy (1999), it is assumed that an update with φ is always executable i.e. the authors suppose that \({R^{\varphi!}(M,w)}\) is never empty and always yields the updated model \({\widetilde{M^{\varphi}}}\). Thus these theories of information update assume that the formula \({\left\langle \varphi!\right\rangle \top}\) is valid. Here we suppose that belief update is triggered only under certain specific preconditions. This implies that in our view \({R^{\varphi!}(M,w)}\) may be empty (see Definition 6). This is the most striking difference between our version of belief update and standard versions of it. We will try to characterize the necessary preconditions for belief update in the next section and to investigate the role of surprise in the process.

The first relevant aspect to verify is whether the previous model transformation guarantees that the semantic constraints given in Sect. 2.3 are preserved. This is indeed the case.

Lemma 1

If M is a SURPRISE model then \({\widetilde{M^{\varphi}}}\) is a SURPRISE model too.

Let us suppose in a way similar to Kooi (2003) that an update with φ can be performed only if the agent does not assign zero probability to φ. This assumption is made explicit in our framework by the next postulate:

$${(NotZero_{Upd})\left\langle \varphi!\right\rangle \top\rightarrow P(\varphi) > 0}.$$

This property corresponds to the following semantic constraint:

  • for every \(w\in W \hbox{ if } {R^{\varphi!}(M,w)\neq\emptyset} \hbox{ then } {X_{w}(||\varphi||^{W_{w}}) > 0}.\) Footnote 19

You should notice that under this requirement the formula \({Bel\varphi\rightarrow\left[\neg\varphi!\right]\bot}\) becomes valid, that is, if an agent believes that φ holds then an update with \({\neg\varphi}\) cannot be executed. This implies that under the present framework belief revision with inconsistent information is left unspecified. Footnote 20

We also postulate that an agent has always epistemic access to all executable updates, that is, if an update with sentence φ is executable then the agent believes that the update with φ is executable. Formally:

$${(Access_{Upd})\left\langle \varphi!\right\rangle \top\rightarrow Bel\left\langle \varphi!\right\rangle \top}.$$

This property corresponds to the following semantic constraint:

  • for every w∈W, if there is a w′ such that w′∈B(w) and \({R^{\varphi!}(M,w')=\emptyset}\) then \({R^{\varphi!}(M,w)=\emptyset}\).

Before starting to investigate some formal consequences of our definition of belief update we provide the following definition of objective formula.

Definition 7

We define the set of objective formulas OBJ = {o 1,o 2,...} as the smallest set such that:

  • if φ∈PROP then φ∈OBJ (propositional formulas are objective formulas);

  • if φ∈PROP then Test(φ) and \({\neg Test(\varphi) \in OBJ}\) (test formulas and negations of test formulas are objective formulas);

  • if φ∈PROP then Datum(φ) and \({\neg Datum(\varphi) \in OBJ}\) (perception formulas and negations of perception formulas are objective formulas);

  • if \(o_{1}\in OBJ\) and α∈ACT then \(\left[\alpha\right]o_{1}\) and \(\left\langle \alpha\right\rangle o_{1}\in OBJ\).

We can now prove that the principles summarized in the following Lemma 2 are sound given the semantics of update processes (Definition 6).

Lemma 2

  • \({(Upd_{1})\left\langle \varphi!\right\rangle \Phi\rightarrow\left[\varphi!\right]\Phi}\)

  • \({(Upd_{2})o_{1}\rightarrow\left[\varphi!\right]o_{1}}\) where \(o_{1}\) is an objective formula

  • \({(Upd_{3})Bel(\varphi\rightarrow\left\langle \varphi!\right\rangle \Phi)\rightarrow\left[\varphi!\right]Bel\Phi}\)

  • \({(Upd_{4})\left\langle \varphi!\right\rangle Bel\Phi\rightarrow Bel(\varphi\rightarrow\left[\varphi!\right]\Phi)}\)

  • \({(Upd_{5})(\sum_{i=1}^{n}d_{i}P(\varphi\wedge\left\langle \varphi!\right\rangle \Phi_{i})\geq cP(\varphi))\rightarrow\left[\varphi!\right](\sum_{i=1}^{n}d_{i}P(\Phi_{i})\geq c)}\)

  • \({(Upd_{6})\left\langle \varphi!\right\rangle (\sum_{i=1}^{n}d_{i}P(\Phi_{i})\geq c)\rightarrow(\sum_{i=1}^{n}d_{i}P(\varphi\wedge\left[\varphi!\right]\Phi_{i})\geq cP(\varphi))}\)

  • \({(Upd_{7})\left[\alpha\right]\left\langle \varphi!\right\rangle \Phi\rightarrow\left[\varphi!\right]\left[\alpha\right]\Phi}\)

(Upd 1) establishes that belief updates are deterministic. According to (Upd 2) the truth value of an objective formula does not change after a belief update. (Upd 3), (Upd 4), (Upd 5) and (Upd 6) describe how beliefs and probabilities change after an update. According to (Upd 7) the effects of an update process on a model are independent of the fact that the update process may be executed after or before a sequence of actions (a sequence where each element is either an atomic action or a perceptual action or a retrieve mental operation).

Finally, we can precisely define our extended logic of surprise with update processes.

We call SURPRISE+ the logic axiomatized by the axioms and inference rules of the logic SURPRISE (see Sect. 2.3) plus the previous nine principles for update processes (Upd 1)–(Upd 7), (Access Upd) and (NotZero Upd). Moreover, we write \({\vdash_{Surp+}\varphi}\) if φ is a theorem of SURPRISE+.

Using the seven principles summarized in Lemma 2 and the previous postulates (Access Upd) and (Incl Bel/Prob) (Sect. 2.3), we can prove that two compact reduction principles follow from the axiomatic system of our logic: one for beliefs and updates, the other for probabilities and updates. These two principles are similar to Gerbrandy's reduction principle for beliefs and updates (Gerbrandy and Groeneveld 1997; Gerbrandy 1999) and to Kooi's reduction principle for probabilities and updates (Kooi 2003). These results are summarized in the following theorem.

Theorem 1

  • \({(Upd_{8})\vdash_{Surp+}Bel(\varphi\rightarrow\left[\varphi!\right]\Phi)\vee\left[\varphi!\right]\bot\longleftrightarrow\left[\varphi!\right]Bel\Phi}\)

  • \({(Upd_{9}) \vdash_{Surp+}(\sum_{i=1}^{n}d_{i}P(\varphi\wedge\left[\varphi!\right]\Phi_{i})\geq cP(\varphi))\vee\left[\varphi!\right]\bot\longleftrightarrow\left[\varphi!\right](\sum_{i=1}^{n}d_{i}P(\Phi_{i})\geq c)}\)

Several interesting properties of update processes follow from Theorem 1 and the principles given in Lemma 2. Let us consider only some of them.

Proposition 2

  • \({(Upd_{10})\vdash_{Surp+}\left[\varphi!\right]Bel\varphi}\)

  • \({(Upd_{11})\vdash_{Surp+}Bel^{m}o_{1}\rightarrow\left[\varphi!\right]Bel^{m}o_{1}}\) for each m > 0

  • \({(Upd_{12})\vdash_{Surp+}P(\varphi|\psi)=c\rightarrow\left[\psi!\right]P(\varphi)=c}\) where \({P(\varphi|\psi)=\frac{P(\varphi\wedge\psi)}{P(\psi)}}\)

According to (Upd 10) after an update with φ the agent believes that φ holds. According to (Upd 11) every m-level nested belief that \(o_{1}\) holds (where \(o_{1}\) is an objective formula) is preserved after a belief update. (Upd 12) shows the strong similarity between updating with propositional formulas in our framework and classical Bayesian updating. Footnote 21
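To make the analogy concrete, the following small sketch is purely illustrative (it is our own and lies outside the formal system): a probability distribution over a handful of possible worlds is updated with a propositional sentence ψ by conditioning, and the resulting probability of φ coincides with the prior conditional probability P(φ|ψ), exactly as (Upd 12) prescribes.

```python
# Illustration of (Upd 12): updating with a propositional sentence psi behaves
# like Bayesian conditioning, so the posterior P(phi) equals the prior P(phi|psi).
# Worlds, names and numbers are ours and purely illustrative.

worlds = [
    {"p": True,  "q": True},
    {"p": True,  "q": False},
    {"p": False, "q": True},
    {"p": False, "q": False},
]
prior = [0.4, 0.2, 0.3, 0.1]          # prior probability of each world

def prob(event, dist):
    """P(event) under distribution dist (event: world -> bool)."""
    return sum(x for w, x in zip(worlds, dist) if event(w))

def update(event, dist):
    """Conditioning on event; only executable if P(event) > 0 (NotZero_Upd)."""
    z = prob(event, dist)
    assert z > 0, "update with a zero-probability sentence is not executable"
    return [x / z if event(w) else 0.0 for w, x in zip(worlds, dist)]

psi = lambda w: w["q"]                 # the sentence psi = q we update with
phi = lambda w: w["p"]                 # the sentence phi = p we track

prior_conditional = prob(lambda w: phi(w) and psi(w), prior) / prob(psi, prior)
posterior = update(psi, prior)

assert abs(prob(phi, posterior) - prior_conditional) < 1e-12
print(prob(phi, posterior), prior_conditional)   # both approximately 0.571
```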

4.2 Surprise-based belief update

We noticed in the previous section that there is an important difference between the present approach to belief update and some standard approaches (Kooi 2003; Gerbrandy and Groeneveld 1997; Gerbrandy 1999). Unlike the standard approaches, we have supposed that belief update is triggered only under certain specific preconditions and is not always executable. The aim of this section is to characterize some of these necessary preconditions for belief update and to show that surprise plays a crucial role in triggering this mental process.

We begin by formalizing our general intuition, supposing that two necessary preconditions for belief update are expressed by the following two additional principles (NecTrig1) and (NecTrig2).

$${(NecTrig1)\left\langle \varphi!\right\rangle \top\rightarrow Datum(\varphi)}$$

Footnote 22

$${(NecTrig2)\left\langle \varphi!\right\rangle \top\wedge Test(\psi)\rightarrow Bel(\varphi\rightarrow\neg\psi)\vee Background(\varphi)\vee Background(\neg\varphi)}$$

Footnote 23 According to principle (NecTrig1), an agent cannot update its beliefs with sentence φ unless φ is something that the agent has perceived (φ is a perceptual datum of the agent). According to principle (NecTrig2), if the agent is focused on the expectation that ψ, then the agent cannot update its beliefs with φ unless either the agent recognizes a contradiction between φ and its scrutinized expectation that ψ, or φ (or \({\neg\varphi}\)) is the content of an available background expectation. Together, the two principles formally express the following postulate.

An agent can reconsider its previous knowledge with some piece of information φ only if:

  1. φ is some piece of information that the agent has perceived and which is collected as a perceptual datum, and

  2. either the agent recognizes (is aware of) the contradiction and incompatibility between the perceptual datum φ (the object of its perception) and its scrutinized expectation, or the probabilistic expectation that φ (or the expectation that \({\neg\varphi}\)) is (mentally) available at a background level.

Thus, according to the previous postulate, if an agent is not aware of the inconsistency between the perceptual datum φ and its actual scrutinized expectation that ψ, and does not have access to information concerning the plausibility of φ, then the agent cannot revise its knowledge base on the basis of the perceptual datum.

The following example illustrates the plausibility of the present postulate.

Example 4

Mary goes shopping downtown. She is looking for a nice pair of shoes for the New Year’s Eve party. She remembers having heard from Bill that a well-stocked shoe shop has opened in the main square of the town. Mary trusts Bill since she thinks that Bill always gives good advice. Thus she decides to go to the main square of the town in order to find the shoe shop. Now Mary expects with high probability that φ1 = she will find a shop selling a nice pair of shoes in the main square, and focuses her attention on this expectation. While walking toward the shop Mary observes φ2 = there is a Japanese restaurant at the corner of the street. Nevertheless Mary does not care about φ2. Indeed: (1) φ2 is not evaluated to be incompatible with φ1 and (2) Mary has neither an available background expectation that φ2 nor an available background expectation that \({\neg\varphi_{2}}\) which would make her able to recognize the implausibility of the perceived fact φ2. Thus Mary does not reconsider her knowledge base according to what she has perceived, since both a recognition of the implausibility of the perceived fact and a recognition of the incompatibility between the perceived fact and the scrutinized expectation that φ1 are lacking.

Finally, Mary arrives at the main square of the town, where she expects to find the shoe shop and to buy a nice pair of shoes. But Mary sees that no shop is there. Mary recognizes the inconsistency between her scrutinized expectation (φ1) and what is being perceived: indeed, there is no shoe shop where she expected to find one. Since Mary is aware of the incompatibility between the perceived fact and her actual scrutinized expectation, she can reconsider her belief base according to the perceptual datum.

Given the definitions of astonishment and mismatch-based surprise (Sect. 3.1), and supposing that the previous two principles (NecTrig1) and (NecTrig2) are added to our logic SURPRISE+, the following becomes a provable theorem.

Proposition 3

$${\vdash_{Surp+}\left\langle \varphi!\right\rangle \top\wedge Test(\psi)\rightarrow MismatchSurprise(\varphi,\psi)\vee Astonishment(\varphi) }$$

According to Proposition 3, if the agent is focused on the expectation that ψ, then the agent cannot revise its knowledge base with the perceived fact φ unless either the agent gets surprised by the mismatch between the perceptual datum φ and the scrutinized expectation that ψ, or the agent gets astonished by the recognized implausibility of φ. This proposition expresses a general cognitive principle: belief update with a perceived fact φ is triggered only if the agent is surprised or astonished by the perception of φ, that is

Some form of surprise is a necessary precondition for belief update.

This is for us a crucial principle for designing resource-bounded cognitive agents which are focused on a small portion of their complete informational state and which need some mechanism for “signaling” that beliefs must be updated.
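As a rough illustration of such a “signaling” mechanism in an implemented agent, the sketch below gates belief update on the two necessary preconditions (NecTrig1) and (NecTrig2) and replays Example 4 schematically. It is only our schematic reading of the principles; the function, parameter and sentence names are illustrative and not part of the formal system.

```python
# Schematic gate for (NecTrig1) and (NecTrig2): an update with phi is considered
# only if phi was perceived, and either phi is recognized as incompatible with the
# scrutinized expectation or a background expectation about phi (or its negation)
# is available.  Purely illustrative.

def update_is_triggerable(phi, perceptual_data, scrutinized,
                          believes_incompatible, background_expectations):
    # (NecTrig1): phi must be a perceptual datum of the agent.
    if phi not in perceptual_data:
        return False
    # (NecTrig2): a recognized mismatch with the scrutinized expectation, or an
    # available background expectation about phi or about its negation.
    return (believes_incompatible(phi, scrutinized)
            or phi in background_expectations
            or ("not", phi) in background_expectations)

# Example 4, schematically: the restaurant passes (NecTrig1) but fails (NecTrig2);
# the missing shoe shop contradicts the scrutinized expectation and passes both.
mary_data = {"restaurant_on_corner", "no_shoe_shop_in_square"}
mary_background = set()
incompatible = lambda phi, psi: (phi == "no_shoe_shop_in_square"
                                 and psi == "shoe_shop_in_square")

print(update_is_triggerable("restaurant_on_corner", mary_data,
                            "shoe_shop_in_square", incompatible, mary_background))  # False
print(update_is_triggerable("no_shoe_shop_in_square", mary_data,
                            "shoe_shop_in_square", incompatible, mary_background))  # True
```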

After having characterized two “necessary” preconditions for triggering belief update we move toward a brief investigation of the “necessary and sufficient conditions.” We only provide here some general intuitions about this issue.

Some psychologists (Meyer et al. 1997) have noticed that the triggering of a belief update process depends on the intensity of surprise associated with the perception of some fact φ: the higher the intensity of surprise relative to the perception of φ, the higher the probability that the agent will revise its knowledge with φ.

In our view, a first rough approximation of the necessary and sufficient preconditions for belief update is obtained by introducing this further dimension: the intensity of surprise associated with the perception of φ.

We suggest the following as a plausible solution to the identification of the “necessary and sufficient” preconditions for belief update.

We establish that if the agent is scrutinizing the expectation that ψ then it updates its belief base with φ if and only if:

  • φ is a perceptual datum and

  • either the agent gets surprised by the mismatch between the perceptual datum φ and the scrutinized expectation that ψ, and the intensity of mismatch-based surprise exceeds a given threshold Δ, or

  • the agent gets astonished by the recognition of the implausibility of φ, and the intensity of astonishment exceeds the threshold Δ.

We can express the previous principle formally.

$$(NecSuffTrig)\left\langle \varphi!\right\rangle \top\wedge Test(\psi)\longleftrightarrow Test(\psi)\wedge(IntensityMismatchSurprise(\varphi,\psi) > \Delta\vee IntensityAstonishment(\varphi) > \Delta) $$

Let us note two relevant facts. On the one hand, we want to emphasize that both personality factors and motivational factors can affect the value of the threshold Δ, and that the value of Δ changes with the evolution and dynamics of goals and intentions. Since Δ has an intrinsically dynamic nature, its value is not in principle the same for all possible worlds w in a model. Nevertheless, it seems plausible to state that the higher the motivational relevance of the perceived fact (the more important φ is given the actual goals and intentions of the agent), the lower the value of Δ. This implies that I am more prone to revise my knowledge when I perceive something which is relevant with respect to my actual motivations than when I perceive something which is completely irrelevant to them.

On the other hand, we want to emphasize that the previous characterization (NecSuffTrig) of “necessary and sufficient preconditions” for belief update is still somewhat unsatisfactory. It must be stressed that a more articulated model of the process would require a distinction between belief change and belief rejection. Indeed, after having been surprised by the perceived fact φ, the agent does not necessarily “decide” to update its beliefs. The agent may simply decide to reject φ if the source of information is evaluated to be unreliable (Castelfranchi 1997). This means that once the agent has been surprised, the possibility of updating beliefs with a perceived fact φ also depends on the reliability assigned to the source of information (reliability of the sensors, reliability of the communicative source, etc.). Indeed, after being surprised by the perception of φ, I am more prone to revise my knowledge with φ (instead of rejecting the perceptual datum φ) when I consider my sensors to be reliable (“so it is not a hallucination!”) than when I consider them to be unreliable.
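To make the role of the threshold Δ and of source reliability more tangible, here is a minimal sketch of a trigger in the spirit of (NecSuffTrig), extended with the two caveats just discussed. The intensity measures, the way Δ decreases with motivational relevance, the reliability check and all numeric values are our own simplifications and are not part of the formal system.

```python
# Illustrative trigger in the spirit of (NecSuffTrig): the agent updates only if
# some surprise intensity exceeds a threshold Delta that shrinks with motivational
# relevance, and (a further refinement) only if the source is judged reliable.

def threshold(base_delta, motivational_relevance):
    # The more relevant the perceived fact to current goals, the lower Delta.
    return base_delta / (1.0 + motivational_relevance)

def should_update(mismatch_intensity, astonishment_intensity,
                  motivational_relevance, source_reliability,
                  base_delta=0.5, min_reliability=0.7):
    delta = threshold(base_delta, motivational_relevance)
    surprised = (mismatch_intensity > delta) or (astonishment_intensity > delta)
    if not surprised:
        return False    # no surprise, no belief update (cf. Proposition 3)
    # Even a surprised agent may reject the datum if the source (sensors,
    # informant) is judged unreliable (Castelfranchi 1997).
    return source_reliability >= min_reliability

# A relevant, surprising and reliably perceived fact triggers an update; the same
# surprise coming from an unreliable source is rejected instead.
print(should_update(0.6, 0.0, motivational_relevance=2.0, source_reliability=0.9))  # True
print(should_update(0.6, 0.0, motivational_relevance=2.0, source_reliability=0.3))  # False
```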

5 Conclusion

In this paper we have provided a conceptual and formal clarification of the notion of surprise through the elaboration of the ontology developed in Sect. 3. Each kind of surprise has been associated with a particular phase of cognitive processing and involves particular kinds of epistemic representations (representation and expectation under scrutiny, perceptual data, presupposed frame, background expectations and beliefs).

We have identified two main kinds of surprise: mismatch-based surprise and astonishment. The first has been defined as the surprise due to a recognized inconsistency between an expectation under scrutiny and a perceived fact. The second has been defined as the surprise due to the recognition of the implausibility of the perceived fact, where this recognition is based either on the retrieval of a background expectation or on some inferential process (classical deduction or probabilistic inference). We have compared our model with existing psychological models of surprise and shown that an analytic investigation of the concept is still missing and that in these models some important aspects of this cognitive phenomenon are ignored.

In the second part of the paper (Sect. 4) we have investigated the role of surprise in triggering belief update. We think in fact that the notion of surprise should be exploited by current formal models of information update, and we have provided several justifications of this theoretical position. On the one hand, in designing cognitive agents we must relax the assumption that in principle any perception produces a reconsideration of pre-existent beliefs and expectations. Since realistic agents are by definition resource bounded (Wasserman 1999; Cherniak 1986), they should not waste time and energy in reasoning about and reconsidering their knowledge on the basis of every piece of information they get. Relaxing this assumption seems a necessary step toward bridging the existing gap between formal models of belief change and cognitive theories of belief dynamics. On the other hand, once the assumption has been relaxed, we must look for the cognitive precursors of belief change. We have stressed that surprise is perhaps the most important causal precursor of belief change, and we have presented a method to integrate surprise into a formal model of belief update and to investigate its functional role.

More work must be done in order to improve the present model. From a strictly formal point of view, we have not yet proved that our modal logic of surprise is complete. From a more theoretical point of view, we have characterized several kinds of informational mental states such as scrutinized expectations (expectations on which the agent focuses its attention) and background expectations (expectations which are available at a merely automatic and effortless level). Moreover, we have characterized the mental processes which are responsible for modifying those scrutinized expectations (we have called them retrieve mental operations) by transforming a background expectation into a scrutinized one. We still lack a systematic explanation and formal account of why certain expectations rather than others go into the background and become accessible. Moreover, we have not explained why certain expectations rather than others get scrutinized by the agent.

For the moment we leave these formal and theoretical problems unsolved and postpone them to future work.